Text File | 1992-01-05 | 185.5 KB | 5,925 lines |
- .rn '' }`
- ''' $RCSfile: perl.man,v $$Revision: 4.0.1.5 $$Date: 91/11/11 16:42:00 $
- '''
- ''' $Log: perl.man,v $
- ''' Revision 4.0.1.5 91/11/11 16:42:00 lwall
- ''' patch19: added little-endian pack/unpack options
- '''
- ''' Revision 4.0.1.4 91/11/05 18:11:05 lwall
- ''' patch11: added sort {} LIST
- ''' patch11: added eval {}
- ''' patch11: documented meaning of scalar(%foo)
- ''' patch11: sprintf() now supports any length of s field
- '''
- ''' Revision 4.0.1.3 91/06/10 01:26:02 lwall
- ''' patch10: documented some newer features in addenda
- '''
- ''' Revision 4.0.1.2 91/06/07 11:41:23 lwall
- ''' patch4: added global modifier for pattern matches
- ''' patch4: default top-of-form format is now FILEHANDLE_TOP
- ''' patch4: added $^P variable to control calling of perldb routines
- ''' patch4: added $^F variable to specify maximum system fd, default 2
- ''' patch4: changed old $^P to $^X
- '''
- ''' Revision 4.0.1.1 91/04/11 17:50:44 lwall
- ''' patch1: fixed some typos
- '''
- ''' Revision 4.0 91/03/20 01:38:08 lwall
- ''' 4.0 baseline.
- '''
- '''
- .de Sh
- .br
- .ne 5
- .PP
- \fB\\$1\fR
- .PP
- ..
- .de Sp
- .if t .sp .5v
- .if n .sp
- ..
- .de Ip
- .br
- .ie \\n(.$>=3 .ne \\$3
- .el .ne 3
- .IP "\\$1" \\$2
- ..
- '''
- ''' Set up \*(-- to give an unbreakable dash;
- ''' string Tr holds user defined translation string.
- ''' Bell System Logo is used as a dummy character.
- '''
- .tr \(*W-|\(bv\*(Tr
- .ie n \{\
- .ds -- \(*W-
- .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
- .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
- .ds L" ""
- .ds R" ""
- .ds L' '
- .ds R' '
- 'br\}
- .el\{\
- .ds -- \(em\|
- .tr \*(Tr
- .ds L" ``
- .ds R" ''
- .ds L' `
- .ds R' '
- 'br\}
- .TH PERL 1 "\*(RP"
- .UC
- .SH NAME
- perl \- Practical Extraction and Report Language
- .SH SYNOPSIS
- .B perl
- [options] filename args
- .SH DESCRIPTION
- .I Perl
- is an interpreted language optimized for scanning arbitrary text files,
- extracting information from those text files, and printing reports based
- on that information.
- It's also a good language for many system management tasks.
- The language is intended to be practical (easy to use, efficient, complete)
- rather than beautiful (tiny, elegant, minimal).
- It combines (in the author's opinion, anyway) some of the best features of C,
- \fIsed\fR, \fIawk\fR, and \fIsh\fR,
- so people familiar with those languages should have little difficulty with it.
- (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
- even BASIC-PLUS.)
- Expression syntax corresponds quite closely to C expression syntax.
- Unlike most Unix utilities,
- .I perl
- does not arbitrarily limit the size of your data\*(--if you've got
- the memory,
- .I perl
- can slurp in your whole file as a single string.
- Recursion is of unlimited depth.
- And the hash tables used by associative arrays grow as necessary to prevent
- degraded performance.
- .I Perl
- uses sophisticated pattern matching techniques to scan large amounts of
- data very quickly.
- Although optimized for scanning text,
- .I perl
- can also deal with binary data, and can make dbm files look like associative
- arrays (where dbm is available).
- Setuid
- .I perl
- scripts are safer than C programs
- through a dataflow tracing mechanism which prevents many stupid security holes.
- If you have a problem that would ordinarily use \fIsed\fR
- or \fIawk\fR or \fIsh\fR, but it
- exceeds their capabilities or must run a little faster,
- and you don't want to write the silly thing in C, then
- .I perl
- may be for you.
- There are also translators to turn your
- .I sed
- and
- .I awk
- scripts into
- .I perl
- scripts.
- OK, enough hype.
- .PP
- Upon startup,
- .I perl
- looks for your script in one of the following places:
- .Ip 1. 4 2
- Specified line by line via
- .B \-e
- switches on the command line.
- .Ip 2. 4 2
- Contained in the file specified by the first filename on the command line.
- (Note that systems supporting the #! notation invoke interpreters this way.)
- .Ip 3. 4 2
- Passed in implicitly via standard input.
- This only works if there are no filename arguments\*(--to pass
- arguments to a
- .I stdin
- script you must explicitly specify a \- for the script name.
- .PP
- After locating your script,
- .I perl
- compiles it to an internal form.
- If the script is syntactically correct, it is executed.
- .Sh "Options"
- Note: on first reading this section may not make much sense to you. It's here
- at the front for easy reference.
- .PP
- A single-character option may be combined with the following option, if any.
- This is particularly useful when invoking a script using the #! construct which
- only allows one argument. Example:
- .nf
-
- .ne 2
- #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
- .\|.\|.
-
- .fi
- Options include:
- .TP 5
- .BI \-0 digits
- specifies the record separator ($/) as an octal number.
- If there are no digits, the null character is the separator.
- Other switches may precede or follow the digits.
- For example, if you have a version of
- .I find
- which can print filenames terminated by the null character, you can say this:
- .nf
-
- find . \-name '*.bak' \-print0 | perl \-n0e unlink
-
- .fi
- The special value 00 will cause Perl to slurp files in paragraph mode.
- The value 0777 will cause Perl to slurp files whole since there is no
- legal character with that value.
- .TP 5
- .B \-a
- turns on autosplit mode when used with a
- .B \-n
- or
- .BR \-p .
- An implicit split command to the @F array
- is done as the first thing inside the implicit while loop produced by
- the
- .B \-n
- or
- .BR \-p .
- .nf
-
- perl \-ane \'print pop(@F), "\en";\'
-
- is equivalent to
-
- while (<>) {
-     @F = split(\' \');
-     print pop(@F), "\en";
- }
-
- .fi
- .TP 5
- .B \-c
- causes
- .I perl
- to check the syntax of the script and then exit without executing it.
- .TP 5
- .BI \-d
- runs the script under the perl debugger.
- See the section on Debugging.
- .TP 5
- .BI \-D number
- sets debugging flags.
- To watch how it executes your script, use
- .BR \-D14 .
- (This only works if debugging is compiled into your
- .IR perl .)
- Another nice value is \-D1024, which lists your compiled syntax tree.
- And \-D512 displays compiled regular expressions.
- .TP 5
- .BI \-e " commandline"
- may be used to enter one line of script.
- Multiple
- .B \-e
- commands may be given to build up a multi-line script.
- If
- .B \-e
- is given,
- .I perl
- will not look for a script filename in the argument list.
- .TP 5
- .BI \-i extension
- specifies that files processed by the <> construct are to be edited
- in-place.
- It does this by renaming the input file, opening the output file by the
- same name, and selecting that output file as the default for print statements.
- The extension, if supplied, is added to the name of the
- old file to make a backup copy.
- If no extension is supplied, no backup is made.
- Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
- the script:
- .nf
-
- .ne 2
- #!/usr/bin/perl \-pi.bak
- s/foo/bar/;
-
- which is equivalent to
-
- .ne 14
- #!/usr/bin/perl
- while (<>) {
-     if ($ARGV ne $oldargv) {
-         rename($ARGV, $ARGV . \'.bak\');
-         open(ARGVOUT, ">$ARGV");
-         select(ARGVOUT);
-         $oldargv = $ARGV;
-     }
-     s/foo/bar/;
- }
- continue {
-     print; # this prints to original filename
- }
- select(STDOUT);
-
- .fi
- except that the
- .B \-i
- form doesn't need to compare $ARGV to $oldargv to know when
- the filename has changed.
- It does, however, use ARGVOUT for the selected filehandle.
- Note that
- .I STDOUT
- is restored as the default output filehandle after the loop.
- .Sp
- You can use eof to locate the end of each input file, in case you want
- to append to each file, or reset line numbering (see example under eof).
- .TP 5
- .BI \-I directory
- may be used in conjunction with
- .B \-P
- to tell the C preprocessor where to look for include files.
- By default /usr/include and /usr/lib/perl are searched.
- .TP 5
- .BI \-l octnum
- enables automatic line-ending processing. It has two effects:
- first, it automatically chops the line terminator when used with
- .B \-n
- or
- .BR \-p ,
- and second, it assigns $\e to have the value of
- .I octnum
- so that any print statements will have that line terminator added back on. If
- .I octnum
- is omitted, sets $\e to the current value of $/.
- For instance, to trim lines to 80 columns:
- .nf
-
- perl \-lpe \'substr($_, 80) = ""\'
-
- .fi
- Note that the assignment $\e = $/ is done when the switch is processed,
- so the input record separator can be different from the output record
- separator if the
- .B \-l
- switch is followed by a
- .B \-0
- switch:
- .nf
-
- gnufind / \-print0 | perl \-ln0e 'print "found $_" if \-p'
-
- .fi
- This sets $\e to newline and then sets $/ to the null character.
- .TP 5
- .B \-n
- causes
- .I perl
- to assume the following loop around your script, which makes it iterate
- over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
- .nf
-
- .ne 3
- while (<>) {
-     .\|.\|. # your script goes here
- }
-
- .fi
- Note that the lines are not printed by default.
- See
- .B \-p
- to have lines printed.
- Here is an efficient way to delete all files older than a week:
- .nf
-
- find . \-mtime +7 \-print | perl \-nle \'unlink;\'
-
- .fi
- This is faster than using the \-exec switch of find because you don't have to
- start a process on every filename found.
- .TP 5
- .B \-p
- causes
- .I perl
- to assume the following loop around your script, which makes it iterate
- over filename arguments somewhat like \fIsed\fR:
- .nf
-
- .ne 5
- while (<>) {
-     .\|.\|. # your script goes here
- } continue {
-     print;
- }
-
- .fi
- Note that the lines are printed automatically.
- To suppress printing use the
- .B \-n
- switch.
- A
- .B \-p
- overrides a
- .B \-n
- switch.
- .TP 5
- .B \-P
- causes your script to be run through the C preprocessor before
- compilation by
- .IR perl .
- (Since both comments and cpp directives begin with the # character,
- you should avoid starting comments with any words recognized
- by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
- .TP 5
- .B \-s
- enables some rudimentary switch parsing for switches on the command line
- after the script name but before any filename arguments (or before a \-\|\-).
- Any switch found there is removed from @ARGV and sets the corresponding variable in the
- .I perl
- script.
- The following script prints \*(L"true\*(R" if and only if the script is
- invoked with a \-xyz switch.
- .nf
-
- .ne 2
- #!/usr/bin/perl \-s
- if ($xyz) { print "true\en"; }
-
- .fi
- .TP 5
- .B \-S
- makes
- .I perl
- use the PATH environment variable to search for the script
- (unless the name of the script starts with a slash).
- Typically this is used to emulate #! startup on machines that don't
- support #!, in the following manner:
- .nf
-
- #!/usr/bin/perl
- eval "exec /usr/bin/perl \-S $0 $*"
-     if $running_under_some_shell;
-
- .fi
- The system ignores the first line and feeds the script to /bin/sh,
- which proceeds to try to execute the
- .I perl
- script as a shell script.
- The shell executes the second line as a normal shell command, and thus
- starts up the
- .I perl
- interpreter.
- On some systems $0 doesn't always contain the full pathname,
- so the
- .B \-S
- tells
- .I perl
- to search for the script if necessary.
- After
- .I perl
- locates the script, it parses the lines and ignores them because
- the variable $running_under_some_shell is never true.
- A better construct than $* would be ${1+"$@"}, which handles embedded spaces
- and such in the filenames, but doesn't work if the script is being interpreted
- by csh.
- In order to start up sh rather than csh, some systems may have to replace the
- #! line with a line containing just
- a colon, which will be politely ignored by perl.
- Other systems can't control that, and need a totally devious construct that
- will work under any of csh, sh or perl, such as the following:
- .nf
-
- .ne 3
- eval '(exit $?0)' && eval 'exec /usr/bin/perl \-S $0 ${1+"$@"}'
- & eval 'exec /usr/bin/perl \-S $0 $argv:q'
-     if 0;
-
- .fi
- .TP 5
- .B \-u
- causes
- .I perl
- to dump core after compiling your script.
- You can then take this core dump and turn it into an executable file
- by using the undump program (not supplied).
- This speeds startup at the expense of some disk space (which you can
- minimize by stripping the executable).
- (Still, a "hello world" executable comes out to about 200K on my machine.)
- If you are going to run your executable as a set-id program then you
- should probably compile it using taintperl rather than normal perl.
- If you want to execute a portion of your script before dumping, use the
- dump operator instead.
- Note: undump is platform specific and may not be available
- for a particular port of perl.
- .TP 5
- .B \-U
- allows
- .I perl
- to do unsafe operations.
- Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while
- running as superuser, and running setuid programs with fatal taint checks
- turned into warnings.
- .TP 5
- .B \-v
- prints the version and patchlevel of your
- .I perl
- executable.
- .TP 5
- .B \-w
- prints warnings about identifiers that are mentioned only once, and scalar
- variables that are used before being set.
- Also warns about redefined subroutines, and references to undefined
- filehandles or filehandles opened readonly that you are attempting to
- write on.
- Also warns you if you use == on values that don't look like numbers, and if
- your subroutines recurse more than 100 deep.
- .TP 5
- .BI \-x directory
- tells
- .I perl
- that the script is embedded in a message.
- Leading garbage will be discarded until the first line that starts
- with #! and contains the string "perl".
- Any meaningful switches on that line will be applied (but only one
- group of switches, as with normal #! processing).
- If a directory name is specified, Perl will switch to that directory
- before running the script.
- The
- .B \-x
- switch only controls the disposal of leading garbage.
- The script must be terminated with _\|_END_\|_ if there is trailing garbage
- to be ignored (the script can process any or all of the trailing garbage
- via the DATA filehandle if desired).
- .Sh "Data Types and Objects"
- .PP
- .I Perl
- has three data types: scalars, arrays of scalars, and
- associative arrays of scalars.
- Normal arrays are indexed by number, and associative arrays by string.
- .PP
- The interpretation of operations and values in perl sometimes
- depends on the requirements
- of the context around the operation or value.
- There are three major contexts: string, numeric and array.
- Certain operations return array values
- in contexts wanting an array, and scalar values otherwise.
- (If this is true of an operation it will be mentioned in the documentation
- for that operation.)
- Operations which return scalars don't care whether the context is looking
- for a string or a number, but
- scalar variables and values are interpreted as strings or numbers
- as appropriate to the context.
- A scalar is interpreted as TRUE in the boolean sense if it is not the null
- string or 0.
- Booleans returned by operators are 1 for true and 0 or \'\' (the null
- string) for false.
- .PP
- There are actually two varieties of null string: defined and undefined.
- Undefined null strings are returned when there is no real value for something,
- such as when there was an error, or at end of file, or when you refer
- to an uninitialized variable or element of an array.
- An undefined null string may become defined the first time you access it, but
- prior to that you can use the defined() operator to determine whether the
- value is defined or not.
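The difference between a defined and an undefined null value can be checked directly. The following is only a minimal sketch; the variable names ($count, $empty, $never, @report) are hypothetical and are not taken from this manual.

```perl
# Hypothetical names; none of these variables come from the manual.
$count = 0;     # defined, but false in the boolean sense
$empty = '';    # defined null string, also false
# $never is deliberately left unassigned, so its value is undefined.

@report = ();
push(@report, defined($count) ? 'count: defined' : 'count: undefined');
push(@report, defined($empty) ? 'empty: defined' : 'empty: undefined');
push(@report, defined($never) ? 'never: defined' : 'never: undefined');
push(@report, $count ? 'count: true' : 'count: false');
print join("\n", @report), "\n";
```

Note that truth and definedness are separate questions: $count and $empty are both false yet defined, while $never is both false and undefined.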
- .PP
- References to scalar variables always begin with \*(L'$\*(R', even when referring
- to a scalar that is part of an array.
- Thus:
- .nf
-
- .ne 3
- $days \h'|2i'# a simple scalar variable
- $days[28] \h'|2i'# 29th element of array @days
- $days{\'Feb\'}\h'|2i'# one value from an associative array
- $#days \h'|2i'# last index of array @days
-
- but entire arrays or array slices are denoted by \*(L'@\*(R':
-
- @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
- @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
- @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
-
- and entire associative arrays are denoted by \*(L'%\*(R':
-
- %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
- .fi
- .PP
- Any of these eight constructs may serve as an lvalue,
- that is, may be assigned to.
- (It also turns out that an assignment is itself an lvalue in
- certain contexts\*(--see examples under s, tr and chop.)
- Assignment to a scalar evaluates the righthand side in a scalar context,
- while assignment to an array or array slice evaluates the righthand side
- in an array context.
- .PP
- You may find the length of array @days by evaluating
- \*(L"$#days\*(R", as in
- .IR csh .
- (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
- Assigning to $#days changes the length of the array.
- Shortening an array by this method does not actually destroy any values.
- Lengthening an array that was previously shortened recovers the values that
- were in those elements.
- You can also gain some measure of efficiency by preextending an array that
- is going to get big.
- (You can also extend an array by assigning to an element that is off the
- end of the array.
- This differs from assigning to $#whatever in that intervening values
- are set to null rather than recovered.)
- You can truncate an array down to nothing by assigning the null list () to
- it.
- The following are exactly equivalent:
- .nf
-
- @whatever = ();
- $#whatever = $[ \- 1;
-
- .fi
- .PP
- If you evaluate an array in a scalar context, it returns the length of
- the array.
- The following is always true:
- .nf
-
- scalar(@whatever) == $#whatever \- $[ + 1;
-
- .fi
- If you evaluate an associative array in a scalar context, it returns
- a value which is true if and only if the array contains any elements.
- (If there are any elements, the value returned is a string consisting
- of the number of used buckets and the number of allocated buckets, separated
- by a slash.)
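As an illustration of the array-length rules above, here is a small sketch. It assumes the array base has been left at its default of 0, and @days is just a hypothetical array.

```perl
# Assumes the default array base of 0; @days is a hypothetical array.
@days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri');

$last  = $#days;         # subscript of the last element
$count = @days;          # an array in a scalar context yields its length
print "last subscript $last, length $count\n";

$#days = 2;              # shorten the array to three elements
$short = scalar(@days);
print "after shortening: $short elements (@days)\n";

@days = ();              # truncate the array to nothing
$none = scalar(@days);
print "after truncating: $none elements\n";
```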
- .PP
- Multi-dimensional arrays are not directly supported, but see the discussion
- of the $; variable later for a means of emulating multiple subscripts with
- an associative array.
- You could also write a subroutine to turn multiple subscripts into a single
- subscript.
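Both approaches might look something like the sketch below. The subroutine name cell, the grid width $COLS, and the arrays @grid and %table are all hypothetical, and the example relies on the $; behavior that is described later under Predefined Names.

```perl
# Hypothetical 3-by-4 grid flattened into one ordinary array.
$COLS = 4;
sub cell {
    local($row, $col) = @_;
    $row * $COLS + $col;        # value of the last expression is returned
}
$grid[&cell(2, 1)] = 'x';

# The same idea with an associative array: a comma-separated subscript
# is joined with $; to form a single string key.
$table{1, 2} = 'y';
$key = join($;, 1, 2);

print "grid slot ", &cell(2, 1), " holds $grid[9]\n";
print "table entry: $table{$key}\n";
```

The subroutine form keeps the data in a normal array, which suits dense numeric subscripts; the associative form accepts any strings as subscripts.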
- .PP
- Every data type has its own namespace.
- You can, without fear of conflict, use the same name for a scalar variable,
- an array, an associative array, a filehandle, a subroutine name, and/or
- a label.
- Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
- or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
- with respect to variable names.
- (They ARE reserved with respect to labels and filehandles, however, which
- don't have an initial special character.
- Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
- Using uppercase filehandles also improves readability and protects you
- from conflict with future reserved words.)
- Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
- different names.
- Names which start with a letter may also contain digits and underscores.
- Names which do not start with a letter are limited to one character,
- e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
- (Most of the one character names have a predefined significance to
- .IR perl .
- More later.)
- .PP
- Numeric literals are specified in any of the usual floating point or
- integer formats:
- .nf
-
- .ne 5
- 12345
- 12345.67
- .23E-10
- 0xffff # hex
- 0377 # octal
-
- .fi
- String literals are delimited by either single or double quotes.
- They work much like shell quotes:
- double-quoted string literals are subject to backslash and variable
- substitution; single-quoted strings are not (except for \e\' and \e\e).
- The usual backslash rules apply for making characters such as newline, tab,
- etc., as well as some more exotic forms:
- .nf
-
- \et tab
- \en newline
- \er return
- \ef form feed
- \eb backspace
- \ea alarm (bell)
- \ee escape
- \e033 octal char
- \ex1b hex char
- \ec[ control char
- \el lowercase next char
- \eu uppercase next char
- \eL lowercase till \eE
- \eU uppercase till \eE
- \eE end case modification
-
- .fi
- You can also embed newlines directly in your strings, i.e. they can end on
- a different line than they begin.
- This is nice, but if you forget your trailing quote, the error will not be
- reported until
- .I perl
- finds another line containing the quote character, which
- may be much further on in the script.
- Variable substitution inside strings is limited to scalar variables, normal
- array values, and array slices.
- (In other words, identifiers beginning with $ or @, followed by an optional
- bracketed expression as a subscript.)
- The following code segment prints out \*(L"The price is $100.\*(R":
- .nf
-
- .ne 2
- $Price = \'$100\';\h'|3.5i'# not interpreted
- print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
-
- .fi
- Note that you can put curly brackets around the identifier to delimit it
- from following alphanumerics.
- Also note that a single quoted string must be separated from a preceding
- word by a space, since single quote is a valid character in an identifier
- (see Packages).
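A short sketch of the curly-bracket form and of the quoting rules described above; $fruit and the other variable names are hypothetical.

```perl
$fruit   = 'apple';             # hypothetical variable
$plural  = "${fruit}s";         # curly brackets end the name before the "s"
$literal = '${fruit}s';         # single quotes: no substitution at all
$price   = "costs \$2";         # a backslash keeps the $ literal
print "$plural\n$literal\n$price\n";
```

Without the brackets, "$fruits" would refer to a different (and here unset) variable named $fruits.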
- .PP
- Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current
- line number and filename at that point in your program.
- They may only be used as separate tokens; they will not be interpolated
- into strings.
- In addition, the token _\|_END_\|_ may be used to indicate the logical end of the
- script before the actual end of file.
- Any following text is ignored (but may be read via the DATA filehandle).
- The two control characters ^D and ^Z are synonyms for _\|_END_\|_.
- .PP
- A word that doesn't have any other interpretation in the grammar will be
- treated as if it had single quotes around it.
- For this purpose, a word consists only of alphanumeric characters and underline,
- and must start with an alphabetic character.
- As with filehandles and labels, a bare word that consists entirely of
- lowercase letters risks conflict with future reserved words, and if you
- use the
- .B \-w
- switch, Perl will warn you about any such words.
- .PP
- Array values are interpolated into double-quoted strings by joining all the
- elements of the array with the delimiter specified in the $" variable,
- space by default.
- (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
- in double-quoted strings, the interpolation of @array, $array[EXPR],
- @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
- referenced elsewhere in the program or is predefined.)
- The following are equivalent:
- .nf
-
- .ne 4
- $temp = join($",@ARGV);
- system "echo $temp";
-
- system "echo @ARGV";
-
- .fi
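The effect of the $" delimiter can be seen in a minimal sketch such as the following, where @parts is a hypothetical array and the delimiter is changed only inside a block by way of local().

```perl
@parts = ('usr', 'local', 'bin');    # hypothetical list
$spaced = "@parts";                  # joined with the default delimiter, a space
{
    local($") = '/';                 # change the delimiter inside this block only
    $path = "/@parts";
}
$again = "@parts";                   # the default is back in effect
print "$spaced\n$path\n$again\n";
```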
- Within search patterns (which also undergo double-quotish substitution)
- there is a bad ambiguity: Is /$foo[bar]/ to be
- interpreted as /${foo}[bar]/ (where [bar] is a character class for the
- regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
- array @foo)?
- If @foo doesn't otherwise exist, then it's obviously a character class.
- If @foo exists, perl takes a good guess about [bar], and is almost always right.
- If it does guess wrong, or if you're just plain paranoid,
- you can force the correct interpretation with curly brackets as above.
- .PP
- A line-oriented form of quoting is based on the shell here-is (here-document) syntax.
- Following a << you specify a string to terminate the quoted material, and all lines
- following the current line down to the terminating string are the value
- of the item.
- The terminating string may be either an identifier (a word), or some
- quoted text.
- If quoted, the type of quotes you use determines the treatment of the text,
- just as in regular quoting.
- An unquoted identifier works like double quotes.
- There must be no space between the << and the identifier.
- (If you put a space it will be treated as a null identifier, which is
- valid, and matches the first blank line\*(--see Merry Christmas example below.)
- The terminating string must appear by itself (unquoted and with no surrounding
- whitespace) on the terminating line.
- .nf
-
- print <<EOF; # same as above
- The price is $Price.
- EOF
-
- print <<"EOF"; # same as above
- The price is $Price.
- EOF
-
- print << x 10; # null identifier is delimiter
- Merry Christmas!
-
- print <<`EOC`; # execute commands
- echo hi there
- echo lo there
- EOC
-
- print <<foo, <<bar; # you can stack them
- I said foo.
- foo
- I said bar.
- bar
-
- .fi
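To show how the choice of quotes around the terminating string changes the treatment of the text, here is a small sketch that reuses the $Price variable from the earlier example; $interpolated and $literal are hypothetical names.

```perl
$Price = '$100';

# Double-quoted terminator: $Price is substituted.
$interpolated = <<"EOF";
The price is $Price.
EOF

# Single-quoted terminator: the text is taken literally.
$literal = <<'EOF';
The price is $Price.
EOF

print $interpolated, $literal;
```

Each value includes the newline that ends its line of text, so no extra newline is needed when printing.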
- Array literals are denoted by separating individual values by commas, and
- enclosing the list in parentheses:
- .nf
-
- (LIST)
-
- .fi
- In a context not requiring an array value, the value of the array literal
- is the value of the final element, as in the C comma operator.
- For example,
- .nf
-
- .ne 4
- @foo = (\'cc\', \'\-E\', $bar);
-
- assigns the entire array value to array foo, but
-
- $foo = (\'cc\', \'\-E\', $bar);
-
- .fi
- assigns the value of variable bar to variable foo.
- Note that the value of an actual array in a scalar context is the length
- of the array; the following assigns to $foo the value 3:
- .nf
-
- .ne 2
- @foo = (\'cc\', \'\-E\', $bar);
- $foo = @foo; # $foo gets 3
-
- .fi
- You may have an optional comma before the closing parenthesis of an
- array literal, so that you can say:
- .nf
-
- @foo = (
-     1,
-     2,
-     3,
- );
-
- .fi
- When a LIST is evaluated, each element of the list is evaluated in
- an array context, and the resulting array value is interpolated into LIST
- just as if each individual element were a member of LIST. Thus arrays
- lose their identity in a LIST\*(--the list
-
- (@foo,@bar,&SomeSub)
-
- contains all the elements of @foo followed by all the elements of @bar,
- followed by all the elements returned by the subroutine named SomeSub.
- .PP
- A list value may also be subscripted like a normal array.
- Examples:
- .nf
-
- $time = (stat($file))[8]; # stat returns array value
- $digit = ('a','b','c','d','e','f')[$digit-10];
- return (pop(@foo),pop(@foo))[0];
-
- .fi
- .PP
- Array lists may be assigned to if and only if each element of the list
- is an lvalue:
- .nf
-
- ($a, $b, $c) = (1, 2, 3);
-
- ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
-
- The final element may be an array or an associative array:
-
- ($a, $b, @rest) = split;
- local($a, $b, %rest) = @_;
-
- .fi
- You can actually put an array anywhere in the list, but the first array
- in the list will soak up all the values, and anything after it will get
- a null value.
- This may be useful in a local().
- .PP
- An associative array literal contains pairs of values to be interpreted
- as a key and a value:
- .nf
-
- .ne 2
- # same as map assignment above
- %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
-
- .fi
- Array assignment in a scalar context returns the number of elements
- produced by the expression on the right side of the assignment:
- .nf
-
- $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
-
- .fi
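The list rules above (a trailing array soaking up values, subscripting a list value, and array assignment in a scalar context) can be combined in one sketch. The record in @fields is invented for illustration, and all variable names are hypothetical.

```perl
# Hypothetical record; the field values are invented.
@fields = ('root', 'x', 0, 0, 'Superuser', '/', '/bin/sh');

($login, $passwd, @rest) = @fields;   # @rest soaks up everything left over
$shell = (@fields)[$#fields];         # subscripting a list value
$count = (($l, $p) = @fields);        # scalar context: elements on the right side

print "$login uses $shell; ", scalar(@rest), " fields in rest, $count in all\n";
```

Here $count reflects the number of elements on the right side of the assignment, not the two scalars on the left.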
- .PP
- There are several other pseudo-literals that you should know about.
- If a string is enclosed by backticks (grave accents), it first undergoes
- variable substitution just like a double quoted string.
- It is then interpreted as a command, and the output of that command
- is the value of the pseudo-literal, as in a shell.
- In a scalar context, a single string consisting of all the output is
- returned.
- In an array context, an array of values is returned, one for each line
- of output.
- (You can set $/ to use a different line terminator.)
- The command is executed each time the pseudo-literal is evaluated.
- The status value of the command is returned in $? (see Predefined Names
- for the interpretation of $?).
- Unlike in \fIcsh\fR, no translation is done on the return
- data\*(--newlines remain newlines.
- Unlike in any of the shells, single quotes do not hide variable names
- in the command from interpretation.
- To pass a $ through to the shell you need to hide it with a backslash.
- .PP
- Evaluating a filehandle in angle brackets yields the next line
- from that file (newline included, so it's never false until EOF, at
- which time an undefined value is returned).
- Ordinarily you must assign that value to a variable,
- but there is one situation where an automatic assignment happens.
- If (and only if) the input symbol is the only thing inside the conditional of a
- .I while
- loop, the value is
- automatically assigned to the variable \*(L"$_\*(R".
- (This may seem like an odd thing to you, but you'll use the construct
- in almost every
- .I perl
- script you write.)
- Anyway, the following lines are equivalent to each other:
- .nf
-
- .ne 5
- while ($_ = <STDIN>) { print; }
- while (<STDIN>) { print; }
- for (\|;\|<STDIN>;\|) { print; }
- print while $_ = <STDIN>;
- print while <STDIN>;
-
- .fi
- The filehandles
- .IR STDIN ,
- .I STDOUT
- and
- .I STDERR
- are predefined.
- (The filehandles
- .IR stdin ,
- .I stdout
- and
- .I stderr
- will also work except in packages, where they would be interpreted as
- local identifiers rather than global.)
- Additional filehandles may be created with the
- .I open
- function.
- .PP
- If a <FILEHANDLE> is used in a context that is looking for an array, an array
- consisting of all the input lines is returned, one line per array element.
- It's easy to make a LARGE data space this way, so use with care.
- .PP
- The null filehandle <> is special and can be used to emulate the behavior of
- \fIsed\fR and \fIawk\fR.
- Input from <> comes either from standard input, or from each file listed on
- the command line.
- Here's how it works: the first time <> is evaluated, the ARGV array is checked,
- and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
- input.
- The ARGV array is then processed as a list of filenames.
- The loop
- .nf
-
- .ne 3
- while (<>) {
-     .\|.\|. # code for each line
- }
-
- .ne 10
- is equivalent to
-
- unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
- while ($ARGV = shift) {
-     open(ARGV, $ARGV);
-     while (<ARGV>) {
-         .\|.\|. # code for each line
-     }
- }
-
- .fi
- except that it isn't as cumbersome to say.
- It really does shift array @ARGV and put the current filename into
- variable $ARGV.
- It also uses filehandle ARGV internally.
- You can modify @ARGV before the first <> as long as you leave the first
- filename at the beginning of the array.
- Line numbers ($.) continue as if the input was one big happy file.
- (But see example under eof for how to reset line numbers on each file.)
- .PP
- .ne 5
- If you want to set @ARGV to your own list of files, go right ahead.
- If you want to pass switches into your script, you can
- put a loop on the front like this:
- .nf
-
- .ne 10
- while ($_ = $ARGV[0], /\|^\-/\|) {
-     shift;
-     last if /\|^\-\|\-$\|/\|;
-     /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
-     /\|^\-v\|/ \|&& \|$verbose++;
-     .\|.\|. # other switches
- }
- while (<>) {
-     .\|.\|. # code for each line
- }
-
- .fi
- The <> symbol will return FALSE only once.
- If you call it again after this it will assume you are processing another
- @ARGV list, and if you haven't set @ARGV, will input from
- .IR STDIN .
- .PP
- If the string inside the angle brackets is a reference to a scalar variable
- (e.g. <$foo>),
- then that variable contains the name of the filehandle to input from.
- .PP
- If the string inside angle brackets is not a filehandle, it is interpreted
- as a filename pattern to be globbed, and either an array of filenames or the
- next filename in the list is returned, depending on context.
- One level of $ interpretation is done first, but you can't say <$foo>
- because that's an indirect filehandle as explained in the previous
- paragraph.
- You could insert curly brackets to force interpretation as a
- filename glob: <${foo}>.
- Example:
- .nf
-
- .ne 3
- while (<*.c>) {
- chmod 0644, $_;
- }
-
- is equivalent to
-
- .ne 5
- open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
- while (<foo>) {
- chop;
- chmod 0644, $_;
- }
-
- .fi
- In fact, it's currently implemented that way.
- (Which means it will not work on filenames with spaces in them unless
- you have /bin/csh on your machine.)
- Of course, the shortest way to do the above is:
- .nf
-
- chmod 0644, <*.c>;
-
- .fi
- .Sh "Syntax"
- .PP
- A
- .I perl
- script consists of a sequence of declarations and commands.
- The only things that need to be declared in
- .I perl
- are report formats and subroutines.
- See the sections below for more information on those declarations.
- All uninitialized user-created objects are assumed to
- start with a null or 0 value until they
- are defined by some explicit operation such as assignment.
- The sequence of commands is executed just once, unlike in
- .I sed
- and
- .I awk
- scripts, where the sequence of commands is executed for each input line.
- While this means that you must explicitly loop over the lines of your input file
- (or files), it also means you have much more control over which files and which
- lines you look at.
- (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
- .B \-n
- or
- .B \-p
- switch.)
- .PP
- A declaration can be put anywhere a command can, but has no effect on the
- execution of the primary sequence of commands\*(--declarations all take effect
- at compile time.
- Typically all the declarations are put at the beginning or the end of the script.
- .PP
- .I Perl
- is, for the most part, a free-form language.
- (The only exception to this is format declarations, for fairly obvious reasons.)
- Comments are indicated by the # character, and extend to the end of the line.
- If you attempt to use /* */ C comments, they will be interpreted either as
- division or pattern matching, depending on the context.
- So don't do that.
- .Sh "Compound statements"
- In
- .IR perl ,
- a sequence of commands may be treated as one command by enclosing it
- in curly brackets.
- We will call this a BLOCK.
- .PP
- The following compound commands may be used to control flow:
- .nf
-
- .ne 4
- if (EXPR) BLOCK
- if (EXPR) BLOCK else BLOCK
- if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
- LABEL while (EXPR) BLOCK
- LABEL while (EXPR) BLOCK continue BLOCK
- LABEL for (EXPR; EXPR; EXPR) BLOCK
- LABEL foreach VAR (ARRAY) BLOCK
- LABEL BLOCK continue BLOCK
-
- .fi
- Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
- statements.
- This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
- If you want to write conditionals without curly brackets there are several
- other ways to do it.
- The following all do the same thing:
- .nf
-
- .ne 5
- if (!open(foo)) { die "Can't open $foo: $!"; }
- die "Can't open $foo: $!" unless open(foo);
- open(foo) || die "Can't open $foo: $!"; # foo or bust!
- open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
- # a bit exotic, that last one
-
- .fi
- .PP
- The
- .I if
- statement is straightforward.
- Since BLOCKs are always bounded by curly brackets, there is never any
- ambiguity about which
- .I if
- an
- .I else
- goes with.
- If you use
- .I unless
- in place of
- .IR if ,
- the sense of the test is reversed.
- .PP
- The
- .I while
- statement executes the block as long as the expression is true
- (does not evaluate to the null string or 0).
- The LABEL is optional, and if present, consists of an identifier followed by
- a colon.
- The LABEL identifies the loop for the loop control statements
- .IR next ,
- .IR last ,
- and
- .I redo
- (see below).
- If there is a
- .I continue
- BLOCK, it is always executed just before
- the conditional is about to be evaluated again, similarly to the third part
- of a
- .I for
- loop in C.
- Thus it can be used to increment a loop variable, even when the loop has
- been continued via the
- .I next
- statement (similar to the C \*(L"continue\*(R" statement).
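- .PP
- A small sketch of that behavior, with made-up variable names: the continue
- block still increments $i on the passes that next cuts short, so the loop
- terminates.

```perl
$i = 0;
@evens = ();
while ($i < 6) {
    next if $i % 2;        # odd value: skip the rest of the body
    push(@evens, $i);
} continue {
    $i++;                  # runs after every pass, including those ended by next
}
print "@evens\n";          # prints: 0 2 4
```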
- .PP
- If the word
- .I while
- is replaced by the word
- .IR until ,
- the sense of the test is reversed, but the conditional is still tested before
- the first iteration.
- .PP
- In either the
- .I if
- or the
- .I while
- statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
- is true if the value of the last command in that block is true.
- .PP
- The
- .I for
- loop works exactly like the corresponding
- .I while
- loop:
- .nf
-
- .ne 12
- for ($i = 1; $i < 10; $i++) {
- .\|.\|.
- }
-
- is the same as
-
- $i = 1;
- while ($i < 10) {
- .\|.\|.
- } continue {
- $i++;
- }
- .fi
- .PP
- The foreach loop iterates over a normal array value and sets the variable
- VAR to be each element of the array in turn.
- The variable is implicitly local to the loop, and regains its former value
- upon exiting the loop.
- The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
- so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
- If VAR is omitted, $_ is set to each value.
- If ARRAY is an actual array (as opposed to an expression returning an array
- value), you can modify each element of the array
- by modifying VAR inside the loop.
- Examples:
- .nf
-
- .ne 5
- for (@ary) { s/foo/bar/; }
-
- foreach $elem (@elements) {
- $elem *= 2;
- }
-
- .ne 3
- for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
- print $_, "\en"; sleep(1);
- }
-
- for (1..15) { print "Merry Christmas\en"; }
-
- .ne 3
- foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
- print "Item: $item\en";
- }
-
- .fi
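- .PP
- The two properties described above, that the loop variable is local to the
- loop and that it aliases the array elements, can be seen together in a short
- sketch (the variable names and values are only illustrative):

```perl
$elem = 'outer value';       # set before the loop
@elements = (1, 2, 3);
foreach $elem (@elements) {
    $elem *= 2;              # changes the array element itself
}
print "@elements\n";         # prints: 2 4 6
print "$elem\n";             # prints: outer value
```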
- .PP
- The BLOCK by itself (labeled or not) is equivalent to a loop that executes
- once.
- Thus you can use any of the loop control statements in it to leave or
- restart the block.
- The
- .I continue
- block is optional.
- This construct is particularly nice for doing case structures.
- .nf
-
- .ne 6
- foo: {
- if (/^abc/) { $abc = 1; last foo; }
- if (/^def/) { $def = 1; last foo; }
- if (/^xyz/) { $xyz = 1; last foo; }
- $nothing = 1;
- }
-
- .fi
- There is no official switch statement in perl, because there
- are already several ways to write the equivalent.
- In addition to the above, you could write
- .nf
-
- .ne 6
- foo: {
- $abc = 1, last foo if /^abc/;
- $def = 1, last foo if /^def/;
- $xyz = 1, last foo if /^xyz/;
- $nothing = 1;
- }
-
- or
-
- .ne 6
- foo: {
- /^abc/ && do { $abc = 1; last foo; };
- /^def/ && do { $def = 1; last foo; };
- /^xyz/ && do { $xyz = 1; last foo; };
- $nothing = 1;
- }
-
- or
-
- .ne 6
- foo: {
- /^abc/ && ($abc = 1, last foo);
- /^def/ && ($def = 1, last foo);
- /^xyz/ && ($xyz = 1, last foo);
- $nothing = 1;
- }
-
- or even
-
- .ne 8
- if (/^abc/)
- { $abc = 1; }
- elsif (/^def/)
- { $def = 1; }
- elsif (/^xyz/)
- { $xyz = 1; }
- else
- {$nothing = 1;}
-
- .fi
- As it happens, these are all optimized internally to a switch structure,
- so perl jumps directly to the desired statement, and you needn't worry
- about perl executing a lot of unnecessary statements when you have a string
- of 50 elsifs, as long as you are testing the same simple scalar variable
- using ==, eq, or pattern matching as above.
- (If you're curious as to whether the optimizer has done this for a particular
- case statement, you can use the \-D1024 switch to list the syntax tree
- before execution.)
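- .PP
- One way to reuse such a case structure is to wrap it in a subroutine.
- This is only a sketch; the subroutine name classify, the label SWITCH and the
- strings returned are invented for the example.

```perl
sub classify {
    local($_) = @_;            # the patterns below match against the argument
    local($kind);
    SWITCH: {
        /^abc/ && do { $kind = 'abc'; last SWITCH; };
        /^def/ && do { $kind = 'def'; last SWITCH; };
        /^xyz/ && do { $kind = 'xyz'; last SWITCH; };
        $kind = 'nothing';     # reached only when no pattern matched
    }
    $kind;                     # value of the last expression is returned
}

print &classify('defrost'), "\n";   # prints: def
print &classify('other'), "\n";     # prints: nothing
```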
- .Sh "Simple statements"
- The only kind of simple statement is an expression evaluated for its side
- effects.
- Every expression (simple statement) must be terminated with a semicolon.
- Note that this is like C, but unlike Pascal (and
- .IR awk ).
- .PP
- Any simple statement may optionally be followed by a
- single modifier, just before the terminating semicolon.
- The possible modifiers are:
- .nf
-
- .ne 4
- if EXPR
- unless EXPR
- while EXPR
- until EXPR
-
- .fi
- The
- .I if
- and
- .I unless
- modifiers have the expected semantics.
- The
- .I while
- and
- .I until
- modifiers also have the expected semantics (conditional evaluated first),
- except when applied to a do-BLOCK or a do-SUBROUTINE command,
- in which case the block executes once before the conditional is evaluated.
- This is so that you can write loops like:
- .nf
-
- .ne 4
- do {
- $_ = <STDIN>;
- .\|.\|.
- } until $_ \|eq \|".\|\e\|n";
-
- .fi
- (See the
- .I do
- operator below. Note also that the loop control commands described later will
- NOT work in this construct, since modifiers don't take loop labels.
- Sorry.)
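- .PP
- The same shape of loop can be sketched without a terminal by taking lines
- from an array; the array contents here are made up.
- The body runs before the first test, and the loop stops once the lone period
- has been read.

```perl
@input = ("first\n", "second\n", ".\n", "never read\n");
$count = 0;
do {
    $_ = shift(@input);        # stands in for $_ = <STDIN>
    $count++;
} until $_ eq ".\n";
print "read $count lines\n";   # prints: read 3 lines
```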
- .Sh "Expressions"
- Since
- .I perl
- expressions work almost exactly like C expressions, only the differences
- will be mentioned here.
- .PP
- Here's what
- .I perl
- has that C doesn't:
- .Ip ** 8 2
- The exponentiation operator.
- .Ip **= 8
- The exponentiation assignment operator.
- .Ip (\|) 8 3
- The null list, used to initialize an array to null.
- .Ip . 8
- Concatenation of two strings.
- .Ip .= 8
- The concatenation assignment operator.
- .Ip eq 8
- String equality (== is numeric equality).
- For a mnemonic just think of \*(L"eq\*(R" as a string.
- (If you are used to the
- .I awk
- behavior of using == for either string or numeric equality
- based on the current form of the comparands, beware!
- You must be explicit here.)
- .Ip ne 8
- String inequality (!= is numeric inequality).
- .Ip lt 8
- String less than.
- .Ip gt 8
- String greater than.
- .Ip le 8
- String less than or equal.
- .Ip ge 8
- String greater than or equal.
- .Ip cmp 8
- String comparison, returning -1, 0, or 1.
- .Ip <=> 8
- Numeric comparison, returning -1, 0, or 1.
- .Ip =~ 8 2
- Certain operations search or modify the string \*(L"$_\*(R" by default.
- This operator makes that kind of operation work on some other string.
- The right argument is a search pattern, substitution, or translation.
- The left argument is what is supposed to be searched, substituted, or
- translated instead of the default \*(L"$_\*(R".
- The return value indicates the success of the operation.
- (If the right argument is an expression other than a search pattern,
- substitution, or translation, it is interpreted as a search pattern
- at run time.
- This is less efficient than an explicit search, since the pattern must
- be compiled every time the expression is evaluated.)
- The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
- .Ip !~ 8
- Just like =~ except the return value is negated.
- .Ip x 8
- The repetition operator.
- Returns a string consisting of the left operand repeated the
- number of times specified by the right operand.
- In an array context, if the left operand is a list in parens, it repeats
- the list.
- .nf
-
- print \'\-\' x 80; # print row of dashes
- print \'\-\' x80; # illegal, x80 is identifier
-
- print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
-
- @ones = (1) x 80; # an array of 80 1's
- @ones = (5) x @ones; # set all elements to 5
-
- .fi
- .Ip x= 8
- The repetition assignment operator.
- Only works on scalars.
- .Ip .\|. 8
- The range operator, which is really two different operators depending
- on the context.
- In an array context, returns an array of values counting (by ones)
- from the left value to the right value.
- This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
- slice operations on arrays.
- .Sp
- In a scalar context, .\|. returns a boolean value.
- The operator is bistable, like a flip-flop.
- Each .\|. operator maintains its own boolean state.
- It is false as long as its left operand is false.
- Once the left operand is true, the range operator stays true
- until the right operand is true,
- AFTER which the range operator becomes false again.
- (It doesn't become false till the next time the range operator is evaluated.
- It can become false on the same evaluation it became true, but it still returns
- true once.)
- The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
- and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
- The scalar .\|. operator is primarily intended for doing line number ranges
- after
- the fashion of \fIsed\fR or \fIawk\fR.
- The precedence is a little lower than || and &&.
- The value returned is either the null string for false, or a sequence number
- (beginning with 1) for true.
- The sequence number is reset for each range encountered.
- The final sequence number in a range has the string \'E0\' appended to it, which
- doesn't affect its numeric value, but gives you something to search for if you
- want to exclude the endpoint.
- You can exclude the beginning point by waiting for the sequence number to be
- greater than 1.
- If either operand of scalar .\|. is static, that operand is implicitly compared
- to the $. variable, the current line number.
- Examples:
- .nf
-
- .ne 6
- As a scalar operator:
- if (101 .\|. 200) { print; } # print 2nd hundred lines
-
- next line if (1 .\|. /^$/); # skip header lines
-
- s/^/> / if (/^$/ .\|. eof()); # quote body
-
- .ne 4
- As an array operator:
- for (101 .\|. 200) { print; } # print $_ 100 times
-
- @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
- @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
-
- .fi
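- .Sp
- To watch the sequence numbers and the E0 marker, here is a sketch that uses
- expressions rather than line numbers as the operands.
- The BEGIN and END marker strings and the array names are assumptions made for
- the example.

```perl
@lines = ('a', 'BEGIN', 'b', 'c', 'END', 'd');
@trace = ();
@inner = ();
foreach $line (@lines) {
    # Assignment to a scalar puts .. in scalar context.
    $seq = ($line eq 'BEGIN') .. ($line eq 'END');
    push(@trace, "$line=$seq");
    # Keep only lines strictly between the two endpoints.
    push(@inner, $line) if $seq > 1 && $seq !~ /E0$/;
}
print "@trace\n";   # prints: a= BEGIN=1 b=2 c=3 END=4E0 d=
print "@inner\n";   # prints: b c
```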
- .Ip \-x 8
- A file test.
- This unary operator takes one argument, either a filename or a filehandle,
- and tests the associated file to see if something is true about it.
- If the argument is omitted, tests $_, except for \-t, which tests
- .IR STDIN .
- It returns 1 for true and \'\' for false, or the undefined value if the
- file doesn't exist.
- Precedence is higher than logical and relational operators, but lower than
- arithmetic operators.
- The operator may be any of:
- .nf
- \-r File is readable by effective uid.
- \-w File is writable by effective uid.
- \-x File is executable by effective uid.
- \-o File is owned by effective uid.
- \-R File is readable by real uid.
- \-W File is writable by real uid.
- \-X File is executable by real uid.
- \-O File is owned by real uid.
- \-e File exists.
- \-z File has zero size.
- \-s File has non-zero size (returns size).
- \-f File is a plain file.
- \-d File is a directory.
- \-l File is a symbolic link.
- \-p File is a named pipe (FIFO).
- \-S File is a socket.
- \-b File is a block special file.
- \-c File is a character special file.
- \-u File has setuid bit set.
- \-g File has setgid bit set.
- \-k File has sticky bit set.
- \-t Filehandle is opened to a tty.
- \-T File is a text file.
- \-B File is a binary file (opposite of \-T).
- \-M Age of file in days when script started.
- \-A Same for access time.
- \-C Same for inode change time.
-
- .fi
- The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
- is based solely on the mode of the file and the uids and gids of the user.
- There may be other reasons you can't actually read, write or execute the file.
- Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
- \-x and \-X return 1 if any execute bit is set in the mode.
- Scripts run by the superuser may thus need to do a stat() in order to determine
- the actual mode of the file, or temporarily set the uid to something else.
- .Sp
- Example:
- .nf
- .ne 7
-
- while (<>) {
- chop;
- next unless \-f $_; # ignore specials
- .\|.\|.
- }
-
- .fi
- Note that \-s/a/b/ does not do a negated substitution.
- Saying \-exp($foo) still works as expected, however\*(--only single letters
- following a minus are interpreted as file tests.
- .Sp
- The \-T and \-B switches work as follows.
- The first block or so of the file is examined for odd characters such as
- strange control codes or metacharacters.
- If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
- Also, any file containing null in the first block is considered a binary file.
- If \-T or \-B is used on a filehandle, the current stdio buffer is examined
- rather than the first block.
- Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
- a filehandle.
- .PP
- If any of the file tests (or either stat operator) are given the special
- filehandle consisting of a solitary underline, then the stat structure
- of the previous file test (or stat operator) is used, saving a system
- call.
- (This doesn't work with \-t, and you need to remember that lstat and \-l
- will leave values in the stat structure for the symbolic link, not the
- real file.)
- Example:
- .nf
-
- print "Can do.\en" if -r $a || -w _ || -x _;
-
- .ne 9
- stat($filename);
- print "Readable\en" if -r _;
- print "Writable\en" if -w _;
- print "Executable\en" if -x _;
- print "Setuid\en" if -u _;
- print "Setgid\en" if -g _;
- print "Sticky\en" if -k _;
- print "Text\en" if -T _;
- print "Binary\en" if -B _;
-
- .fi
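- .PP
- A minimal sketch of the underline filehandle, assuming only that the current
- directory exists and can be examined: the first test does the stat, and the
- later tests reuse its result.

```perl
$path = '.';                      # assumed to exist: the current directory
$is_dir   = (-d $path) ? 1 : 0;   # this test performs the stat
$is_plain = (-f _) ? 1 : 0;       # these two reuse the saved stat structure
$exists   = (-e _) ? 1 : 0;
print "dir=$is_dir plain=$is_plain exists=$exists\n";
# prints: dir=1 plain=0 exists=1
```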
- .PP
- Here is what C has that
- .I perl
- doesn't:
- .Ip "unary &" 12
- Address-of operator.
- .Ip "unary *" 12
- Dereference-address operator.
- .Ip "(TYPE)" 12
- Type casting operator.
- .PP
- Like C,
- .I perl
- does a certain amount of expression evaluation at compile time, whenever
- it determines that all of the arguments to an operator are static and have
- no side effects.
- In particular, string concatenation happens at compile time between literals that don't do variable substitution.
- Backslash interpretation also happens at compile time.
- You can say
- .nf
-
- .ne 2
- \'Now is the time for all\' . "\|\e\|n" .
- \'good men to come to.\'
-
- .fi
- and this all reduces to one string internally.
- .PP
- The autoincrement operator has a little extra built-in magic to it.
- If you increment a variable that is numeric, or that has ever been used in
- a numeric context, you get a normal increment.
- If, however, the variable has only been used in string contexts since it
- was set, and has a value that is not null and matches the
- pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
- as a string, preserving each character within its range, with carry:
- .nf
-
- print ++($foo = \'99\'); # prints \*(L'100\*(R'
- print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
- print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
- print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
-
- .fi
- The autodecrement is not magical.
- .PP
- The range operator (in an array context) makes use of the magical
- autoincrement algorithm if the minimum and maximum are strings.
- You can say
-
- @alphabet = (\'A\' .. \'Z\');
-
- to get all the letters of the alphabet, or
-
- $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
-
- to get a hexadecimal digit, or
-
- @z2 = (\'01\' .. \'31\'); print $z2[$mday];
-
- to get dates with leading zeros.
- (If the final value specified is not in the sequence that the magical increment
- would produce, the sequence goes until the next value would be longer than
- the final value specified.)
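- .PP
- Two more string ranges, as a sketch with arbitrary endpoints, show the carry
- from one position to the next:

```perl
@codes = ('aa' .. 'ad');     # letters increment within their range
@days  = ('08' .. '11');     # digits carry, and the leading zero is kept
print "@codes\n";            # prints: aa ab ac ad
print "@days\n";             # prints: 08 09 10 11
```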
- .PP
- The || and && operators differ from C's in that, rather than returning 0 or 1,
- they return the last value evaluated.
- Thus, a portable way to find out the home directory might be:
- .nf
-
- $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
- (getpwuid($<))[7] || die "You're homeless!\en";
-
- .fi
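- .PP
- Stripped of the environment lookups, the rule can be sketched with literal
- operands (the values are arbitrary):

```perl
$name = '' || 0 || 'anonymous';    # first true operand is returned
$both = 'left' && 'right';         # all true: last operand is returned
$zero = 0 && 'unreached';          # evaluation stops at the false operand
print "$name $both $zero\n";       # prints: anonymous right 0
```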
- .PP
- Along with the literals and variables mentioned earlier,
- the operations in the following section can serve as terms in an expression.
- Some of these operations take a LIST as an argument.
- Such a list can consist of any combination of scalar arguments or array values;
- the array values will be included in the list as if each individual element were
- interpolated at that point in the list, forming a longer single-dimensional
- array value.
- Elements of the LIST should be separated by commas.
- If an operation is listed both with and without parentheses around its
- arguments, it means you can either use it as a unary operator or
- as a function call.
- To use it as a function call, the next token on the same line must
- be a left parenthesis.
- (There may be intervening white space.)
- Such a function then has highest precedence, as you would expect from
- a function.
- If any token other than a left parenthesis follows, then it is a
- unary operator, with a precedence depending only on whether it is a LIST
- operator or not.
- LIST operators have lowest precedence.
- All other unary operators have a precedence greater than relational operators
- but less than arithmetic operators.
- See the section on Precedence.
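- .PP
- The flattening of array values inside a LIST can be sketched as follows; the
- array names and contents are invented.

```perl
@low  = (1, 2);
@high = (8, 9);
@all  = (0, @low, 5, @high);           # one flat, single-dimensional list
print scalar(@all), ": @all\n";        # prints: 6: 0 1 2 5 8 9
$joined = join('-', 'x', @low, 'y');   # join sees four strings after the separator
print "$joined\n";                     # prints: x-1-2-y
```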
- .Ip "/PATTERN/" 8 4
- See m/PATTERN/.
- .Ip "?PATTERN?" 8 4
- This is just like the /pattern/ search, except that it matches only once between
- calls to the
- .I reset
- operator.
- This is a useful optimization when you only want to see the first occurrence of
- something in each file of a set of files, for instance.
- Only ?? patterns local to the current package are reset.
- .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
- Does the same thing that the accept system call does.
- Returns true if it succeeded, false otherwise.
- See example in section on Interprocess Communication.
- .Ip "alarm(SECONDS)" 8 4
- .Ip "alarm SECONDS" 8
- Arranges to have a SIGALRM delivered to this process after the specified number
- of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause
- a SIGALRM at some point more than 14 seconds in the future.
- Only one timer may be counting at once. Each call disables the previous
- timer, and an argument of 0 may be supplied to cancel the previous timer
- without starting a new one.
- The returned value is the amount of time remaining on the previous timer.
- .Ip "atan2(Y,X)" 8 2
- Returns the arctangent of Y/X in the range
- .if t \-\(*p to \(*p.
- .if n \-PI to PI.
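- .Sp
- A common use, sketched here with made-up variable names, is to derive PI
- itself:

```perl
$pi   = 4 * atan2(1, 1);               # atan2(1,1) is PI/4
$left = atan2(0, -1);                  # the negative X axis lies at PI
printf("%.5f %.5f\n", $pi, $left);     # prints: 3.14159 3.14159
```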
- .Ip "bind(SOCKET,NAME)" 8 2
- Does the same thing that the bind system call does.
- Returns true if it succeeded, false otherwise.
- NAME should be a packed address of the proper type for the socket.
- See example in section on Interprocess Communication.
- .Ip "binmode(FILEHANDLE)" 8 4
- .Ip "binmode FILEHANDLE" 8 4
- Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
- that distinguish between binary and text files.
- Files that are not read in binary mode have CR LF sequences translated
- to LF on input and LF translated to CR LF on output.
- Binmode has no effect under Unix.
- If FILEHANDLE is an expression, the value is taken as the name of
- the filehandle.
- .Ip "caller(EXPR)"
- .Ip "caller"
- Returns the context of the current subroutine call:
- .nf
-
- ($package,$filename,$line) = caller;
-
- .fi
- With EXPR, returns some extra information that the debugger uses to print
- a stack trace. The value of EXPR indicates how many call frames to go
- back before the current one.
- .Ip "chdir(EXPR)" 8 2
- .Ip "chdir EXPR" 8 2
- Changes the working directory to EXPR, if possible.
- If EXPR is omitted, changes to home directory.
- Returns 1 upon success, 0 otherwise.
- See example under
- .IR die .
- .Ip "chmod(LIST)" 8 2
- .Ip "chmod LIST" 8 2
- Changes the permissions of a list of files.
- The first element of the list must be the numerical mode.
- Returns the number of files successfully changed.
- .nf
-
- .ne 2
- $cnt = chmod 0755, \'foo\', \'bar\';
- chmod 0755, @executables;
-
- .fi
- .Ip "chop(LIST)" 8 7
- .Ip "chop(VARIABLE)" 8
- .Ip "chop VARIABLE" 8
- .Ip "chop" 8
- Chops off the last character of a string and returns the character chopped.
- It's used primarily to remove the newline from the end of an input record,
- but is much more efficient than s/\en// because it neither scans nor copies
- the string.
- If VARIABLE is omitted, chops $_.
- Example:
- .nf
-
- .ne 5
- while (<>) {
- chop; # avoid \en on last field
- @array = split(/:/);
- .\|.\|.
- }
-
- .fi
- You can actually chop anything that's an lvalue, including an assignment:
- .nf
-
- chop($cwd = \`pwd\`);
- chop($answer = <STDIN>);
-
- .fi
- If you chop a list, each element is chopped.
- Only the value of the last chop is returned.
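- .Sp
- A short sketch of the return values, using invented strings:

```perl
$line = "root:0:0\n";
$gone = chop($line);              # $gone is the newline, $line is now "root:0:0"
@fields = split(/:/, $line);      # the last field has no trailing newline

@words = ('one.', 'two!');
$last = chop(@words);             # every element loses its final character
print "@words [$last]\n";         # prints: one two [!]
```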
- .Ip "chown(LIST)" 8 2
- .Ip "chown LIST" 8 2
- Changes the owner (and group) of a list of files.
- The first two elements of the list must be the NUMERICAL uid and gid,
- in that order.
- Returns the number of files successfully changed.
- .nf
-
- .ne 2
- $cnt = chown $uid, $gid, \'foo\', \'bar\';
- chown $uid, $gid, @filenames;
-
- .fi
- .ne 23
- Here's an example that looks up non-numeric uids in the passwd file:
- .nf
-
- print "User: ";
- $user = <STDIN>;
- chop($user);
- print "Files: ";
- $pattern = <STDIN>;
- chop($pattern);
- .ie t \{\
- open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
- 'br\}
- .el \{\
- open(pass, \'/etc/passwd\')
- || die "Can't open passwd: $!\en";
- 'br\}
- while (<pass>) {
- ($login,$pass,$uid,$gid) = split(/:/);
- $uid{$login} = $uid;
- $gid{$login} = $gid;
- }
- @ary = <${pattern}>; # get filenames
- if ($uid{$user} eq \'\') {
- die "$user not in passwd file";
- }
- else {
- chown $uid{$user}, $gid{$user}, @ary;
- }
-
- .fi
- .Ip "chroot(FILENAME)" 8 5
- .Ip "chroot FILENAME" 8
- Does the same as the system call of that name.
- If you don't know what it does, don't worry about it.
- If FILENAME is omitted, does chroot to $_.
- .Ip "close(FILEHANDLE)" 8 5
- .Ip "close FILEHANDLE" 8
- Closes the file or pipe associated with the file handle.
- You don't have to close FILEHANDLE if you are immediately going to
- do another open on it, since open will close it for you.
- (See
- .IR open .)
- However, an explicit close on an input file resets the line counter ($.), while
- the implicit close done by
- .I open
- does not.
- Also, closing a pipe will wait for the process executing on the pipe to complete,
- in case you want to look at the output of the pipe afterwards.
- Closing a pipe explicitly also puts the status value of the command into $?.
- Example:
- .nf
-
- .ne 4
- open(OUTPUT, \'|sort >foo\'); # pipe to sort
- .\|.\|. # print stuff to output
- close OUTPUT; # wait for sort to finish
- open(INPUT, \'foo\'); # get sort's results
-
- .fi
- FILEHANDLE may be an expression whose value gives the real filehandle name.
- .Ip "closedir(DIRHANDLE)" 8 5
- .Ip "closedir DIRHANDLE" 8
- Closes a directory opened by opendir().
- .Ip "connect(SOCKET,NAME)" 8 2
- Does the same thing that the connect system call does.
- Returns true if it succeeded, false otherwise.
- NAME should be a packed address of the proper type for the socket.
- See example in section on Interprocess Communication.
- .Ip "cos(EXPR)" 8 6
- .Ip "cos EXPR" 8 6
- Returns the cosine of EXPR (expressed in radians).
- If EXPR is omitted takes cosine of $_.
- .Ip "crypt(PLAINTEXT,SALT)" 8 6
- Encrypts a string exactly like the crypt() function in the C library.
- Useful for checking the password file for lousy passwords.
- Only the guys wearing white hats should do this.
- .Ip "dbmclose(ASSOC_ARRAY)" 8 6
- .Ip "dbmclose ASSOC_ARRAY" 8
- Breaks the binding between a dbm file and an associative array.
- The values remaining in the associative array are meaningless unless
- you happen to want to know what was in the cache for the dbm file.
- This function is only useful if you have ndbm.
- .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
- This binds a dbm or ndbm file to an associative array.
- ASSOC is the name of the associative array.
- (Unlike normal open, the first argument is NOT a filehandle, even though
- it looks like one).
- DBNAME is the name of the database (without the .dir or .pag extension).
- If the database does not exist, it is created with protection specified
- by MODE (as modified by the umask).
- If your system only supports the older dbm functions, you may perform only one
- dbmopen in your program.
- If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
- error.
- .Sp
- Values assigned to the associative array prior to the dbmopen are lost.
- A certain number of values from the dbm file are cached in memory.
- By default this number is 64, but you can increase it by preallocating
- that number of garbage entries in the associative array before the dbmopen.
- You can flush the cache if necessary with the reset command.
- .Sp
- If you don't have write access to the dbm file, you can only read
- associative array variables, not set them.
- If you want to test whether you can write, either use file tests or
- try setting a dummy array entry inside an eval, which will trap the error.
- .Sp
- Note that functions such as keys() and values() may return huge array values
- when used on large dbm files.
- You may prefer to use the each() function to iterate over large dbm files.
- Example:
- .nf
-
- .ne 6
- # print out history file offsets
- dbmopen(HIST,'/usr/lib/news/history',0666);
- while (($key,$val) = each %HIST) {
- print $key, ' = ', unpack('L',$val), "\en";
- }
- dbmclose(HIST);
-
- .fi
- .Ip "defined(EXPR)" 8 6
- .Ip "defined EXPR" 8
- Returns a boolean value saying whether the lvalue EXPR has a real value
- or not.
- Many operations return the undefined value under exceptional conditions,
- such as end of file, uninitialized variable, system error and such.
- This function allows you to distinguish between an undefined null string
- and a defined null string with operations that might return a real null
- string, in particular referencing elements of an array.
- You may also check to see if arrays or subroutines exist.
- Use on predefined variables is not guaranteed to produce intuitive results.
- Examples:
- .nf
-
- .ne 7
- print if defined $switch{'D'};
- print "$val\en" while defined($val = pop(@ary));
- die "Can't readlink $sym: $!"
- unless defined($value = readlink $sym);
- eval '@foo = ()' if defined(@foo);
- die "No XYZ package defined" unless defined %_XYZ;
- sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
-
- .fi
- See also undef.
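- .Sp
- The difference between a defined null string and the undefined value can be
- sketched like this; the array contents are assumptions chosen so that every
- real value is false.

```perl
%switch = ('D', '', 'v', 1);
$d_given = defined($switch{'D'}) ? 'yes' : 'no';   # set, though null
$x_given = defined($switch{'x'}) ? 'yes' : 'no';   # never assigned

@ary = (3, 0, '');
$count = 0;
$count++ while defined($val = pop(@ary));   # false values do not stop the loop
print "$d_given $x_given $count\n";         # prints: yes no 3
```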
- .Ip "delete $ASSOC{KEY}" 8 6
- Deletes the specified value from the specified associative array.
- Returns the deleted value, or the undefined value if nothing was deleted.
- Deleting from $ENV{} modifies the environment.
- Deleting from an array bound to a dbm file deletes the entry from the dbm
- file.
- .Sp
- The following deletes all the values of an associative array:
- .nf
-
- .ne 3
- foreach $key (keys %ARRAY) {
- delete $ARRAY{$key};
- }
-
- .fi
- (But it would be faster to use the
- .I reset
- command.
- Saying undef %ARRAY is faster yet.)
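- .Sp
- The return value can be sketched with a small, made-up array:

```perl
%color = ('apple', 'red', 'pear', 'green');
$gone = delete $color{'apple'};       # returns the deleted value
$none = delete $color{'plum'};        # nothing to delete: undefined value
@left = keys(%color);
print "$gone @left\n";                                  # prints: red pear
print defined($none) ? "defined\n" : "undefined\n";     # prints: undefined
```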
- .Ip "die(LIST)" 8
- .Ip "die LIST" 8
- Outside of an eval, prints the value of LIST to
- .I STDERR
- and exits with the current value of $!
- (errno).
- If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
- If ($? >> 8) is 0, exits with 255.
- Inside an eval, the error message is stuffed into $@ and the eval is terminated
- with the undefined value.
- .Sp
- Equivalent examples:
- .nf
-
- .ne 3
- .ie t \{\
- die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
- 'br\}
- .el \{\
- die "Can't cd to spool: $!\en"
- unless chdir \'/usr/spool/news\';
- 'br\}
-
- chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en";
-
- .fi
- .Sp
- If the value of LIST does not end in a newline, the current script line
- number and input line number (if any) are also printed, and a newline is
- supplied.
- Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
- better sense when the string \*(L"at foo line 123\*(R" is appended.
- Suppose you are running script \*(L"canasta\*(R".
- .nf
-
- .ne 7
- die "/etc/games is no good";
- die "/etc/games is no good, stopped";
-
- produce, respectively
-
- /etc/games is no good at canasta line 123.
- /etc/games is no good, stopped at canasta line 123.
-
- .fi
- See also
- .IR exit .
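- .Sp
- Inside an eval the same messages end up in $@ instead of on STDERR.
- A minimal sketch, using the eval BLOCK form and invented messages:

```perl
$value = eval { die "/etc/games is no good\n"; 1; };
$message = $@;                        # the message, newline included
print "trapped: $message" unless defined($value);
# prints: trapped: /etc/games is no good

eval { die "no newline here"; };      # location is appended to this one
print "location added\n" if $@ =~ /^no newline here at .* line \d+\.$/;
# prints: location added
```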
- .Ip "do BLOCK" 8 4
- Returns the value of the last command in the sequence of commands indicated
- by BLOCK.
- When modified by a loop modifier, executes the BLOCK once before testing the
- loop condition.
- (On other statements the loop modifiers test the conditional first.)
- .Ip "do SUBROUTINE (LIST)" 8 3
- Executes a SUBROUTINE declared by a
- .I sub
- declaration, and returns the value
- of the last expression evaluated in SUBROUTINE.
- If there is no subroutine by that name, produces a fatal error.
- (You may use the \*(L"defined\*(R" operator to determine if a subroutine
- exists.)
- If you pass arrays as part of LIST you may wish to pass the length
- of the array in front of each array.
- (See the section on subroutines later on.)
- The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
- form.
- .Sp
- SUBROUTINE may also be a single scalar variable, in which case
- the name of the subroutine to execute is taken from the variable.
- .Sp
- As an alternate (and preferred) form,
- you may call a subroutine by prefixing the name with
- an ampersand: &foo(@args).
- If you aren't passing any arguments, you don't have to use parentheses.
- If you omit the parentheses, no @_ array is passed to the subroutine.
- The & form is also used to specify subroutines to the defined and undef
- operators:
- .nf
-
- if (defined &$var) { &$var($parm); undef &$var; }
-
- .fi
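- .Sp
- The following sketch calls a subroutine whose name is held in a scalar
- variable; the subroutine double and the variable $name are hypothetical:
- .nf
-
- .ne 3
- sub double { $_[0] * 2; }
- $name = 'double';
- $four = &$name(2) if defined &$name;  # $four is 4
-
- .fi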
- .Ip "do EXPR" 8 3
- Uses the value of EXPR as a filename and executes the contents of the file
- as a
- .I perl
- script.
- Its primary use is to include subroutines from a
- .I perl
- subroutine library.
- .nf
-
- do \'stat.pl\';
-
- is just like
-
- eval \`cat stat.pl\`;
-
- .fi
- except that it's more efficient, more concise, keeps track of the current
- filename for error messages, and searches all the
- .B \-I
- libraries if the file
- isn't in the current directory (see also the @INC array in Predefined Names).
- It's the same, however, in that it does reparse the file every time you
- call it, so if you are going to use the file inside a loop you might prefer
- to use \-P and #include, at the expense of a little more startup time.
- (The main problem with #include is that cpp doesn't grok # comments\*(--a
- workaround is to use \*(L";#\*(R" for standalone comments.)
- Note that the following are NOT equivalent:
- .nf
-
- .ne 2
- do $foo; # eval a file
- do $foo(); # call a subroutine
-
- .fi
- Note that inclusion of library routines is better done with
- the \*(L"require\*(R" operator.
- .Ip "dump LABEL" 8 6
- This causes an immediate core dump.
- Primarily this is so that you can use the undump program to turn your
- core dump into an executable binary after having initialized all your
- variables at the beginning of the program.
- When the new binary is executed it will begin by executing a "goto LABEL"
- (with all the restrictions that goto suffers).
- Think of it as a goto with an intervening core dump and reincarnation.
- If LABEL is omitted, restarts the program from the top.
- WARNING: any files opened at the time of the dump will NOT be open any more
- when the program is reincarnated, with possible resulting confusion on the part
- of perl.
- See also \-u.
- .Sp
- Example:
- .nf
-
- .ne 16
- #!/usr/bin/perl
- require 'getopt.pl';
- require 'stat.pl';
- %days = (
- 'Sun',1,
- 'Mon',2,
- 'Tue',3,
- 'Wed',4,
- 'Thu',5,
- 'Fri',6,
- 'Sat',7);
-
- dump QUICKSTART if $ARGV[0] eq '-d';
-
- QUICKSTART:
- do Getopt('f');
-
- .fi
- .Ip "each(ASSOC_ARRAY)" 8 6
- .Ip "each ASSOC_ARRAY" 8
- Returns a 2 element array consisting of the key and value for the next
- value of an associative array, so that you can iterate over it.
- Entries are returned in an apparently random order.
- When the array is entirely read, a null array is returned (which when
- assigned produces a FALSE (0) value).
- The next call to each() after that will start iterating again.
- The iterator can be reset only by reading all the elements from the array.
- You must not modify the array while iterating over it.
- There is a single iterator for each associative array, shared by all
- each(), keys() and values() function calls in the program.
- The following prints out your environment like the printenv program, only
- in a different order:
- .nf
-
- .ne 3
- while (($key,$value) = each %ENV) {
- print "$key=$value\en";
- }
-
- .fi
- See also keys() and values().
- .Ip "eof(FILEHANDLE)" 8 8
- .Ip "eof()" 8
- .Ip "eof" 8
- Returns 1 if the next read on FILEHANDLE will return end of file, or if
- FILEHANDLE is not open.
- FILEHANDLE may be an expression whose value gives the real filehandle name.
- (Note that this function actually reads a character and then ungetc's it,
- so it is not very useful in an interactive context.)
- An eof without an argument returns the eof status for the last file read.
- Empty parentheses () may be used to indicate the pseudo file formed of the
- files listed on the command line, i.e. eof() is reasonable to use inside
- a while (<>) loop to detect the end of only the last file.
- Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
- Examples:
- .nf
-
- .ne 7
- # insert dashes just before last line of last file
- while (<>) {
- if (eof()) {
- print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
- }
- print;
- }
-
- .ne 7
- # reset line numbering on each input file
- while (<>) {
- print "$.\et$_";
- if (eof) { # Not eof().
- close(ARGV);
- }
- }
-
- .fi
- .Ip "eval(EXPR)" 8 6
- .Ip "eval EXPR" 8 6
- .Ip "eval BLOCK" 8 6
- EXPR is parsed and executed as if it were a little
- .I perl
- program.
- It is executed in the context of the current
- .I perl
- program, so that
- any variable settings, subroutine or format definitions remain afterwards.
- The value returned is the value of the last expression evaluated, just
- as with subroutines.
- If there is a syntax error or runtime error, or a die statement is
- executed, an undefined value is returned by
- eval, and $@ is set to the error message.
- If there was no error, $@ is guaranteed to be a null string.
- If EXPR is omitted, evaluates $_.
- The final semicolon, if any, may be omitted from the expression.
- .Sp
- Note that, since eval traps otherwise-fatal errors, it is useful for
- determining whether a particular feature
- (such as dbmopen or symlink) is implemented.
- It is also Perl's exception trapping mechanism, where the die operator is
- used to raise exceptions.
- .Sp
- If the code to be executed doesn't vary, you may use
- the eval-BLOCK form to trap run-time errors without incurring
- the penalty of recompiling each time.
- The error, if any, is still returned in $@.
- Evaluating a single-quoted string (as EXPR) has the same effect, except that
- the eval-EXPR form reports syntax errors at run time via $@, whereas the
- eval-BLOCK form reports syntax errors at compile time. The eval-EXPR form
- is optimized to eval-BLOCK the first time it succeeds. (Since the replacement
- side of a substitution is considered a single-quoted string when you
- use the e modifier, the same optimization occurs there.) Examples:
- .nf
-
- .ne 11
- # make divide-by-zero non-fatal
- eval { $answer = $a / $b; }; warn $@ if $@;
-
- # optimized to same thing after first use
- eval '$answer = $a / $b'; warn $@ if $@;
-
- # a compile-time error
- eval { $answer = };
-
- # a run-time error
- eval '$answer ='; # sets $@
-
- .fi
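- .Sp
- Here is a sketch of die used to raise an exception that eval then traps;
- the subroutine safe_div is invented for the example:
- .nf
-
- .ne 8
- sub safe_div {
-     local($num, $den) = @_;
-     die "division by zero requested" if $den == 0;
-     $num / $den;
- }
-
- $result = eval { &safe_div(10, 0); };
- print STDERR "trapped: $@" if $@;  # $result is undefined here
-
- .fi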
- .Ip "exec(LIST)" 8 8
- .Ip "exec LIST" 8 6
- If there is more than one argument in LIST, or if LIST is an array with
- more than one value,
- calls execvp() with the arguments in LIST.
- If there is only one scalar argument, the argument is checked for shell metacharacters.
- If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
- If there are none, the argument is split into words and passed directly to
- execvp(), which is more efficient.
- Note: exec (and system) do not flush your output buffer, so you may need to
- set $| to avoid lost output.
- Examples:
- .nf
-
- exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
- exec "sort $outfile | uniq";
-
- .fi
- .Sp
- If you don't really want to execute the first argument, but want to lie
- to the program you are executing about its own name, you can specify
- the program you actually want to run by assigning that to a variable and
- putting the name of the variable in front of the LIST without a comma.
- (This always forces interpretation of the LIST as a multi-valued list, even
- if there is only a single scalar in the list.)
- Example:
- .nf
-
- .ne 2
- $shell = '/bin/csh';
- exec $shell '-sh'; # pretend it's a login shell
-
- .fi
- .Ip "exit(EXPR)" 8 6
- .Ip "exit EXPR" 8
- Evaluates EXPR and exits immediately with that value.
- Example:
- .nf
-
- .ne 2
- $ans = <STDIN>;
- exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
-
- .fi
- See also
- .IR die .
- If EXPR is omitted, exits with 0 status.
- .Ip "exp(EXPR)" 8 3
- .Ip "exp EXPR" 8
- Returns
- .I e
- to the power of EXPR.
- If EXPR is omitted, gives exp($_).
- .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
- Implements the fcntl(2) function.
- You'll probably have to say
- .nf
-
- require "fcntl.ph"; # probably /usr/local/lib/perl/fcntl.ph
-
- .fi
- first to get the correct function definitions.
- If fcntl.ph doesn't exist or doesn't have the correct definitions
- you'll have to roll
- your own, based on your C header files such as <sys/fcntl.h>.
- (There is a perl script called h2ph that comes with the perl kit
- which may help you in this.)
- Argument processing and value return work just like ioctl below.
- Note that fcntl will produce a fatal error if used on a machine that doesn't implement
- fcntl(2).
- .Ip "fileno(FILEHANDLE)" 8 4
- .Ip "fileno FILEHANDLE" 8 4
- Returns the file descriptor for a filehandle.
- Useful for constructing bitmaps for select().
- If FILEHANDLE is an expression, the value is taken as the name of
- the filehandle.
- .Ip "flock(FILEHANDLE,OPERATION)" 8 4
- Calls flock(2) on FILEHANDLE.
- See manual page for flock(2) for definition of OPERATION.
- Returns true for success, false on failure.
- Will produce a fatal error if used on a machine that doesn't implement
- flock(2).
- Here's a mailbox appender for BSD systems.
- .nf
-
- .ne 20
- $LOCK_SH = 1;
- $LOCK_EX = 2;
- $LOCK_NB = 4;
- $LOCK_UN = 8;
-
- sub lock {
- flock(MBOX,$LOCK_EX);
- # and, in case someone appended
- # while we were waiting...
- seek(MBOX, 0, 2);
- }
-
- sub unlock {
- flock(MBOX,$LOCK_UN);
- }
-
- open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
- || die "Can't open mailbox: $!";
-
- do lock();
- print MBOX $msg,"\en\en";
- do unlock();
-
- .fi
- .Ip "fork" 8 4
- Does a fork() call.
- Returns the child pid to the parent process and 0 to the child process.
- Note: unflushed buffers remain unflushed in both processes, which means
- you may need to set $| to avoid duplicate output.
- .Ip "getc(FILEHANDLE)" 8 4
- .Ip "getc FILEHANDLE" 8
- .Ip "getc" 8
- Returns the next character from the input file attached to FILEHANDLE, or
- a null string at EOF.
- If FILEHANDLE is omitted, reads from STDIN.
- .Ip "getlogin" 8 3
- Returns the current login from /etc/utmp, if any.
- If it returns null, fall back on getpwuid:
- .nf
-
- $login = getlogin || (getpwuid($<))[0] || "Somebody";
-
- .fi
- .Ip "getpeername(SOCKET)" 8 3
- Returns the packed sockaddr address of the other end of the SOCKET connection.
- .nf
-
- .ne 4
- # An internet sockaddr
- $sockaddr = 'S n a4 x8';
- $hersockaddr = getpeername(S);
- .ie t \{\
- ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
- 'br\}
- .el \{\
- ($family, $port, $heraddr) =
- unpack($sockaddr,$hersockaddr);
- 'br\}
-
- .fi
- .Ip "getpgrp(PID)" 8 4
- .Ip "getpgrp PID" 8
- Returns the current process group for the specified PID, 0 for the current
- process.
- Will produce a fatal error if used on a machine that doesn't implement
- getpgrp(2).
- If PID is omitted, returns process group of current process.
- .Ip "getppid" 8 4
- Returns the process id of the parent process.
- .Ip "getpriority(WHICH,WHO)" 8 4
- Returns the current priority for a process, a process group, or a user.
- (See getpriority(2).)
- Will produce a fatal error if used on a machine that doesn't implement
- getpriority(2).
- .Ip "getpwnam(NAME)" 8
- .Ip "getgrnam(NAME)" 8
- .Ip "gethostbyname(NAME)" 8
- .Ip "getnetbyname(NAME)" 8
- .Ip "getprotobyname(NAME)" 8
- .Ip "getpwuid(UID)" 8
- .Ip "getgrgid(GID)" 8
- .Ip "getservbyname(NAME,PROTO)" 8
- .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
- .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
- .Ip "getprotobynumber(NUMBER)" 8
- .Ip "getservbyport(PORT,PROTO)" 8
- .Ip "getpwent" 8
- .Ip "getgrent" 8
- .Ip "gethostent" 8
- .Ip "getnetent" 8
- .Ip "getprotoent" 8
- .Ip "getservent" 8
- .Ip "setpwent" 8
- .Ip "setgrent" 8
- .Ip "sethostent(STAYOPEN)" 8
- .Ip "setnetent(STAYOPEN)" 8
- .Ip "setprotoent(STAYOPEN)" 8
- .Ip "setservent(STAYOPEN)" 8
- .Ip "endpwent" 8
- .Ip "endgrent" 8
- .Ip "endhostent" 8
- .Ip "endnetent" 8
- .Ip "endprotoent" 8
- .Ip "endservent" 8
- These routines perform the same functions as their counterparts in the
- system library.
- The return values from the various get routines are as follows:
- .nf
-
- ($name,$passwd,$uid,$gid,
- $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
- ($name,$passwd,$gid,$members) = getgr.\|.\|.
- ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
- ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
- ($name,$aliases,$proto) = getproto.\|.\|.
- ($name,$aliases,$port,$proto) = getserv.\|.\|.
-
- .fi
- The $members value returned by getgr.\|.\|. is a space separated list
- of the login names of the members of the group.
- .Sp
- The @addrs value returned by the gethost.\|.\|. functions is a list of the
- raw addresses returned by the corresponding system library call.
- In the Internet domain, each address is four bytes long and you can unpack
- it by saying something like:
- .nf
-
- ($a,$b,$c,$d) = unpack('C4',$addrs[0]);
-
- .fi
- .Ip "getsockname(SOCKET)" 8 3
- Returns the packed sockaddr address of this end of the SOCKET connection.
- .nf
-
- .ne 4
- # An internet sockaddr
- $sockaddr = 'S n a4 x8';
- $mysockaddr = getsockname(S);
- .ie t \{\
- ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
- 'br\}
- .el \{\
- ($family, $port, $myaddr) =
- unpack($sockaddr,$mysockaddr);
- 'br\}
-
- .fi
- .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
- Returns the socket option requested, or undefined if there is an error.
- .Ip "gmtime(EXPR)" 8 4
- .Ip "gmtime EXPR" 8
- Converts a time as returned by the time function to a 9-element array with
- the time analyzed for the Greenwich timezone.
- Typically used as follows:
- .nf
-
- .ne 3
- .ie t \{\
- ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
- 'br\}
- .el \{\
- ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
- gmtime(time);
- 'br\}
-
- .fi
- All array elements are numeric, and come straight out of a struct tm.
- In particular this means that $mon has the range 0.\|.11 and $wday has the
- range 0.\|.6.
- If EXPR is omitted, does gmtime(time).
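- .Sp
- As a sketch of the adjustments this implies (struct tm also counts $year
- from 1900, and time 0 is assumed here to be the usual Unix epoch):
- .nf
-
- .ne 3
- ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
- $date = sprintf("%d/%d/%d", $mon + 1, $mday, $year + 1900);
- # $date is "1/1/1970"
-
- .fi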
- .Ip "goto LABEL" 8 6
- Finds the statement labeled with LABEL and resumes execution there.
- Currently you may only go to statements in the main body of the program
- that are not nested inside a do {} construct.
- This statement is not implemented very efficiently, and is here only to make
- the
- .IR sed -to- perl
- translator easier.
- I may change its semantics at any time, consistent with support for translated
- .I sed
- scripts.
- Use it at your own risk.
- Better yet, don't use it at all.
- .Ip "grep(EXPR,LIST)" 8 4
- Evaluates EXPR for each element of LIST (locally setting $_ to each element)
- and returns the array value consisting of those elements for which the
- expression evaluated to true.
- In a scalar context, returns the number of times the expression was true.
- .nf
-
- @foo = grep(!/^#/, @bar); # weed out comments
-
- .fi
- Note that, since $_ is a reference into the array value, it can be
- used to modify the elements of the array.
- While this is useful and supported, it can cause bizarre results if
- the LIST is not a named array.
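- .Sp
- A sketch of the scalar-context count and of modification through $_,
- using a made-up @words array:
- .nf
-
- .ne 4
- @words = ('apple', 'Bob', 'cherry');
- $caps = grep(/^[A-Z]/, @words);  # scalar context: $caps is 1
- @hits = grep(s/^a/A/, @words);   # changes $words[0] through $_
- # now @hits is ('Apple') and @words is ('Apple', 'Bob', 'cherry')
-
- .fi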
- .Ip "hex(EXPR)" 8 4
- .Ip "hex EXPR" 8
- Returns the decimal value of EXPR interpreted as a hex string.
- (To interpret strings that might start with 0 or 0x see oct().)
- If EXPR is omitted, uses $_.
- .Ip "index(STR,SUBSTR,POSITION)" 8 4
- .Ip "index(STR,SUBSTR)" 8 4
- Returns the position of the first occurrence of SUBSTR in STR at or after
- POSITION.
- If POSITION is omitted, starts searching from the beginning of the string.
- The return value is based at 0, or whatever you've
- set the $[ variable to.
- If the substring is not found, returns one less than the base, ordinarily \-1.
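- .Sp
- For instance, with $[ left at 0 (the string here is only an illustration):
- .nf
-
- .ne 4
- $str = 'hello world';
- $first = index($str, 'o');             # 4
- $next = index($str, 'o', $first + 1);  # 7
- $none = index($str, 'z');              # -1
-
- .fi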
- .Ip "int(EXPR)" 8 4
- .Ip "int EXPR" 8
- Returns the integer portion of EXPR.
- If EXPR is omitted, uses $_.
- .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
- Implements the ioctl(2) function.
- You'll probably have to say
- .nf
-
- require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph
-
- .fi
- first to get the correct function definitions.
- If ioctl.ph doesn't exist or doesn't have the correct definitions
- you'll have to roll
- your own, based on your C header files such as <sys/ioctl.h>.
- (There is a perl script called h2ph that comes with the perl kit
- which may help you in this.)
- SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
- to the string value of SCALAR will be passed as the third argument of
- the actual ioctl call.
- (If SCALAR has no string value but does have a numeric value, that value
- will be passed rather than a pointer to the string value.
- To guarantee this to be true, add a 0 to the scalar before using it.)
- The pack() and unpack() functions are useful for manipulating the values
- of structures used by ioctl().
- The following example sets the erase character to DEL.
- .nf
-
- .ne 9
- require 'ioctl.ph';
- $sgttyb_t = "ccccs"; # 4 chars and a short
- if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
- @ary = unpack($sgttyb_t,$sgttyb);
- $ary[2] = 127;
- $sgttyb = pack($sgttyb_t,@ary);
- ioctl(STDIN,$TIOCSETP,$sgttyb)
- || die "Can't ioctl: $!";
- }
-
- .fi
- The return value of ioctl (and fcntl) is as follows:
- .nf
-
- .ne 4
- if OS returns:\h'|3i'perl returns:
- -1\h'|3i' undefined value
- 0\h'|3i' string "0 but true"
- anything else\h'|3i' that number
-
- .fi
- Thus perl returns true on success and false on failure, yet you can still
- easily determine the actual value returned by the operating system:
- .nf
-
- ($retval = ioctl(...)) || ($retval = -1);
- printf "System returned %d\en", $retval;
- .fi
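- .Sp
- The string "0 but true" can be examined on its own, without making any
- system call, as in this sketch ($retval is set by hand to stand in for a
- successful ioctl):
- .nf
-
- .ne 3
- $retval = "0 but true";  # what perl hands back when the OS returns 0
- $ok = $retval ? 1 : 0;   # $ok is 1, since the string is true
- $num = $retval + 0;      # $num is 0, the value the OS returned
-
- .fi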
- .Ip "join(EXPR,LIST)" 8 8
- .Ip "join(EXPR,ARRAY)" 8
- Joins the separate strings of LIST or ARRAY into a single string with fields
- separated by the value of EXPR, and returns the string.
- Example:
- .nf
-
- .ie t \{\
- $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
- 'br\}
- .el \{\
- $_ = join(\|\':\',
- $login,$passwd,$uid,$gid,$gcos,$home,$shell);
- 'br\}
-
- .fi
- See
- .IR split .
- .Ip "keys(ASSOC_ARRAY)" 8 6
- .Ip "keys ASSOC_ARRAY" 8
- Returns a normal array consisting of all the keys of the named associative
- array.
- The keys are returned in an apparently random order, but it is the same order
- as either the values() or each() function produces (given that the associative array
- has not been modified).
- Here is yet another way to print your environment:
- .nf
-
- .ne 5
- @keys = keys %ENV;
- @values = values %ENV;
- while ($#keys >= 0) {
- print pop(@keys), \'=\', pop(@values), "\en";
- }
-
- or how about sorted by key:
-
- .ne 3
- foreach $key (sort(keys %ENV)) {
- print $key, \'=\', $ENV{$key}, "\en";
- }
-
- .fi
- .Ip "kill(LIST)" 8 8
- .Ip "kill LIST" 8 2
- Sends a signal to a list of processes.
- The first element of the list must be the signal to send.
- Returns the number of processes successfully signaled.
- .nf
-
- $cnt = kill 1, $child1, $child2;
- kill 9, @goners;
-
- .fi
- If the signal is negative, kills process groups instead of processes.
- (On System V, a negative \fIprocess\fR number will also kill process groups,
- but that's not portable.)
- You may use a signal name in quotes.
- .Ip "last LABEL" 8 8
- .Ip "last" 8
- The
- .I last
- command is like the
- .I break
- statement in C (as used in loops); it immediately exits the loop in question.
- If the LABEL is omitted, the command refers to the innermost enclosing loop.
- The
- .I continue
- block, if any, is not executed:
- .nf
-
- .ne 4
- line: while (<STDIN>) {
- last line if /\|^$/; # exit when done with header
- .\|.\|.
- }
-
- .fi
- .Ip "length(EXPR)" 8 4
- .Ip "length EXPR" 8
- Returns the length in characters of the value of EXPR.
- If EXPR is omitted, returns length of $_.
- .Ip "link(OLDFILE,NEWFILE)" 8 2
- Creates a new filename linked to the old filename.
- Returns 1 for success, 0 otherwise.
- .Ip "listen(SOCKET,QUEUESIZE)" 8 2
- Does the same thing that the listen system call does.
- Returns true if it succeeded, false otherwise.
- See example in section on Interprocess Communication.
- .Ip "local(LIST)" 8 4
- Declares the listed variables to be local to the enclosing block,
- subroutine, eval or \*(L"do\*(R".
- All the listed elements must be legal lvalues.
- This operator works by saving the current values of those variables in LIST
- on a hidden stack and restoring them upon exiting the block, subroutine or eval.
- This means that called subroutines can also reference the local variable,
- but not the global one.
- The LIST may be assigned to if desired, which allows you to initialize
- your local variables.
- (If no initializer is given for a particular variable, it is created with
- an undefined value.)
- Commonly this is used to name the parameters to a subroutine.
- Examples:
- .nf
-
- .ne 13
- sub RANGEVAL {
- local($min, $max, $thunk) = @_;
- local($result) = \'\';
- local($i);
-
- # Presumably $thunk makes reference to $i
-
- for ($i = $min; $i < $max; $i++) {
- $result .= eval $thunk;
- }
-
- $result;
- }
-
- .ne 6
- if ($sw eq \'-v\') {
- # init local array with global array
- local(@ARGV) = @ARGV;
- unshift(@ARGV,\'echo\');
- system @ARGV;
- }
- # @ARGV restored
-
- .ne 6
- # temporarily add to digits associative array
- if ($base12) {
- # (NOTE: not claiming this is efficient!)
- local(%digits) = (%digits,'t',10,'e',11);
- do parse_num();
- }
-
- .fi
- Note that local() is a run-time command, and so gets executed every time
- through a loop, using up more stack storage each time until it's all
- released at once when the loop is exited.
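- .Sp
- A sketch of the point that called subroutines see the local value rather
- than the global one; the names $level, show and test are hypothetical:
- .nf
-
- .ne 8
- $level = 'global';
- sub show { $level; }
- sub test {
-     local($level) = 'local';
-     &show();
- }
- $inside = &test();   # 'local'
- $outside = &show();  # 'global' again, since the old value was restored
-
- .fi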
- .Ip "localtime(EXPR)" 8 4
- .Ip "localtime EXPR" 8
- Converts a time as returned by the time function to a 9-element array with
- the time analyzed for the local timezone.
- Typically used as follows:
- .nf
-
- .ne 3
- .ie t \{\
- ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
- 'br\}
- .el \{\
- ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
- localtime(time);
- 'br\}
-
- .fi
- All array elements are numeric, and come straight out of a struct tm.
- In particular this means that $mon has the range 0.\|.11 and $wday has the
- range 0.\|.6.
- If EXPR is omitted, does localtime(time).
- .Ip "log(EXPR)" 8 4
- .Ip "log EXPR" 8
- Returns logarithm (base
- .IR e )
- of EXPR.
- If EXPR is omitted, returns log of $_.
- .Ip "lstat(FILEHANDLE)" 8 6
- .Ip "lstat FILEHANDLE" 8
- .Ip "lstat(EXPR)" 8
- .Ip "lstat SCALARVARIABLE" 8
- Does the same thing as the stat() function, but stats a symbolic link
- instead of the file the symbolic link points to.
- If symbolic links are unimplemented on your system, a normal stat is done.
- .Ip "m/PATTERN/gio" 8 4
- .Ip "/PATTERN/gio" 8
- Searches a string for a pattern match, and returns true (1) or false (\'\').
- If no string is specified via the =~ or !~ operator,
- the $_ string is searched.
- (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
- See also the section on regular expressions.
- .Sp
- If / is the delimiter then the initial \*(L'm\*(R' is optional.
- With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
- as delimiters.
- This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
- If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
- done in a case-insensitive manner.
- PATTERN may contain references to scalar variables, which will be interpolated
- (and the pattern recompiled) every time the pattern search is evaluated.
- (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
- If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
- the trailing delimiter.
- This avoids expensive run-time recompilations, and
- is useful when the value you are interpolating won't change over the
- life of the script.
- If the PATTERN evaluates to a null string, the most recent successful
- regular expression is used instead.
- .Sp
- If used in a context that requires an array value, a pattern match returns an
- array consisting of the subexpressions matched by the parentheses in the
- pattern,
- i.e. ($1, $2, $3.\|.\|.).
- It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
- or $'.
- If the match fails, a null array is returned.
- If the match succeeds, but there were no parentheses, an array value of (1)
- is returned.
- .Sp
- Examples:
- .nf
-
- .ne 4
- open(tty, \'/dev/tty\');
- <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired
-
- if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
-
- next if m#^/usr/spool/uucp#;
-
- .ne 5
- # poor man's grep
- $arg = shift;
- while (<>) {
- print if /$arg/o; # compile only once
- }
-
- if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
-
- .fi
- This last example splits $foo into the first two words and the remainder
- of the line, and assigns those three fields to $F1, $F2 and $Etc.
- The conditional is true if any variables were assigned, i.e. if the pattern
- matched.
- .Sp
- The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
- matching as many times as possible within the string. How it behaves
- depends on the context. In an array context, it returns a list of
- all the substrings matched by all the parentheses in the regular expression.
- If there are no parentheses, it returns a list of all the matched strings,
- as if there were parentheses around the whole pattern. In a scalar context,
- it iterates through the string, returning TRUE each time it matches, and
- FALSE when it eventually runs out of matches. (In other words, it remembers
- where it left off last time and restarts the search at that point.) It
- presumes that you have not modified the string since the last match.
- Modifying the string between matches may result in undefined behavior.
- (You can actually get away with in-place modifications via substr()
- that do not change the length of the entire string. In general, however,
- you should be using s///g for such modifications.) Examples:
- .nf
-
- # array context
- ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
-
- # scalar context
- $/ = ""; $* = 1; # read by paragraph, match across lines
- while ($paragraph = <>) {
- while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
- $sentences++;
- }
- }
- print "$sentences\en";
-
- .fi
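- .Sp
- A shorter sketch of the two contexts, using an invented string:
- .nf
-
- .ne 4
- $text = 'a1 b22 c333';
- @nums = ($text =~ /[0-9]+/g);       # array context: ('1', '22', '333')
- $count = 0;
- $count++ while $text =~ /[0-9]+/g;  # scalar context: $count ends up 3
-
- .fi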
- .Ip "mkdir(FILENAME,MODE)" 8 3
- Creates the directory specified by FILENAME, with permissions specified by
- MODE (as modified by umask).
- If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
- .Ip "msgctl(ID,CMD,ARG)" 8 4
- Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG
- must be a variable which will hold the returned msqid_ds structure.
- Returns like ioctl: the undefined value for error, "0 but true" for
- zero, or the actual return value otherwise.
- .Ip "msgget(KEY,FLAGS)" 8 4
- Calls the System V IPC function msgget. Returns the message queue id,
- or the undefined value if there is an error.
- .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
- Calls the System V IPC function msgsnd to send the message MSG to the
- message queue ID. MSG must begin with the long integer message type,
- which may be created with pack("L", $type). Returns true if
- successful, or false if there is an error.
- .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
- Calls the System V IPC function msgrcv to receive a message from
- message queue ID into variable VAR with a maximum message size of
- SIZE. Note that if a message is received, the message type will be
- the first thing in VAR, and the maximum length of VAR is SIZE plus the
- size of the message type. Returns true if successful, or false if
- there is an error.
- .Ip "next LABEL" 8 8
- .Ip "next" 8
- The
- .I next
- command is like the
- .I continue
- statement in C; it starts the next iteration of the loop:
- .nf
-
- .ne 4
- line: while (<STDIN>) {
- next line if /\|^#/; # discard comments
- .\|.\|.
- }
-
- .fi
- Note that if there were a
- .I continue
- block on the above, it would get executed even on discarded lines.
- If the LABEL is omitted, the command refers to the innermost enclosing loop.
- .Ip "oct(EXPR)" 8 4
- .Ip "oct EXPR" 8
- Returns the decimal value of EXPR interpreted as an octal string.
- (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
- The following will handle decimal, octal and hex in the standard notation:
- .nf
-
- $val = oct($val) if $val =~ /^0/;
-
- .fi
- If EXPR is omitted, uses $_.
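- .Sp
- As a sketch, here is that statement applied to three made-up input strings:
- .nf
-
- .ne 5
- @vals = ('42', '0755', '0x1f');
- foreach $val (@vals) {
-     $val = oct($val) if $val =~ /^0/;
- }
- # @vals is now (42, 493, 31)
-
- .fi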
- .Ip "open(FILEHANDLE,EXPR)" 8 8
- .Ip "open(FILEHANDLE)" 8
- .Ip "open FILEHANDLE" 8
- Opens the file whose filename is given by EXPR, and associates it with
- FILEHANDLE.
- If FILEHANDLE is an expression, its value is used as the name of the
- real filehandle wanted.
- If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
- contains the filename.
- If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
- input.
- If the filename begins with \*(L">\*(R", the file is opened for output.
- If the filename begins with \*(L">>\*(R", the file is opened for appending.
- (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
- want both read and write access to the file.)
- If the filename begins with \*(L"|\*(R", the filename is interpreted
- as a command to which output is to be piped, and if the filename ends
- with a \*(L"|\*(R", the filename is interpreted as a command which pipes
- input to us.
- (You may not have a command that pipes both in and out.)
- Opening \'\-\' opens
- .I STDIN
- and opening \'>\-\' opens
- .IR STDOUT .
- Open returns non-zero upon success, the undefined value otherwise.
- If the open involved a pipe, the return value happens to be the pid
- of the subprocess.
- Examples:
- .nf
-
- .ne 3
- $article = 100;
- open article || die "Can't find article $article: $!\en";
- while (<article>) {\|.\|.\|.
-
- .ie t \{\
- open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved)
- 'br\}
- .el \{\
- open(LOG, \'>>/usr/spool/news/twitlog\'\|);
- # (log is reserved)
- 'br\}
-
- .ie t \{\
- open(article, "caesar <$article |"\|); # decrypt article
- 'br\}
- .el \{\
- open(article, "caesar <$article |"\|);
- # decrypt article
- 'br\}
-
- .ie t \{\
- open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process#
- 'br\}
- .el \{\
- open(extract, "|sort >/tmp/Tmp$$"\|);
- # $$ is our process#
- 'br\}
-
- .ne 7
- # process argument list of files along with any includes
-
- foreach $file (@ARGV) {
- do process($file, \'fh00\'); # no pun intended
- }
-
- sub process {
- local($filename, $input) = @_;
- $input++; # this is a string increment
- unless (open($input, $filename)) {
- print STDERR "Can't open $filename: $!\en";
- return;
- }
- .ie t \{\
- while (<$input>) { # note the use of indirection
- 'br\}
- .el \{\
- while (<$input>) { # note use of indirection
- 'br\}
- if (/^#include "(.*)"/) {
- do process($1, $input);
- next;
- }
- .\|.\|. # whatever
- }
- }
-
- .fi
- You may also, in the Bourne shell tradition, specify an EXPR beginning
- with \*(L">&\*(R", in which case the rest of the string
- is interpreted as the name of a filehandle
- (or file descriptor, if numeric) which is to be duped and opened.
- You may use & after >, >>, <, +>, +>> and +<.
- The mode you specify should match the mode of the original filehandle.
- Here is a script that saves, redirects, and restores
- .I STDOUT
- and
- .IR STDERR :
- .nf
-
- .ne 21
- #!/usr/bin/perl
- open(SAVEOUT, ">&STDOUT");
- open(SAVEERR, ">&STDERR");
-
- open(STDOUT, ">foo.out") || die "Can't redirect stdout";
- open(STDERR, ">&STDOUT") || die "Can't dup stdout";
-
- select(STDERR); $| = 1; # make unbuffered
- select(STDOUT); $| = 1; # make unbuffered
-
- print STDOUT "stdout 1\en"; # this works for
- print STDERR "stderr 1\en"; # subprocesses too
-
- close(STDOUT);
- close(STDERR);
-
- open(STDOUT, ">&SAVEOUT");
- open(STDERR, ">&SAVEERR");
-
- print STDOUT "stdout 2\en";
- print STDERR "stderr 2\en";
-
- .fi
- If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
- then there is an implicit fork done, and the return value of open
- is the pid of the child within the parent process, and 0 within the child
- process.
- (Use defined($pid) to determine if the open was successful.)
- The filehandle behaves normally for the parent, but i/o to that
- filehandle is piped from/to the
- .IR STDOUT / STDIN
- of the child process.
- In the child process the filehandle isn't opened\*(--i/o happens from/to
- the new
- .I STDOUT
- or
- .IR STDIN .
- Typically this is used like the normal piped open when you want to exercise
- more control over just how the pipe command gets executed, such as when
- you are running setuid, and don't want to have to scan shell commands
- for metacharacters.
- The following pairs are more or less equivalent:
- .nf
-
- .ne 5
- open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
- open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
-
- open(FOO, "cat \-n '$file'|");
- open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
-
- .fi
- Explicitly closing any piped filehandle causes the parent process to wait for the
- child to finish, and returns the status value in $?.
- Note: on any operation which may do a fork,
- unflushed buffers remain unflushed in both
- processes, which means you may need to set $| to
- avoid duplicate output.
- .Sp
- The filename that is passed to open will have leading and trailing
- whitespace deleted.
- In order to open a file with arbitrary weird characters in it, it's necessary
- to protect any leading and trailing whitespace thusly:
- .nf
-
- .ne 2
- $file =~ s#^(\es)#./$1#;
- open(FOO, "< $file\e0");
-
- .fi
- .Ip "opendir(DIRHANDLE,EXPR)" 8 3
- Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
- rewinddir() and closedir().
- Returns true if successful.
- DIRHANDLEs have their own namespace separate from FILEHANDLEs.
- .Ip "ord(EXPR)" 8 4
- .Ip "ord EXPR" 8
- Returns the numeric ascii value of the first character of EXPR.
- If EXPR is omitted, uses $_.
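- .Sp
- As a small illustration (the variable names are invented for the example),
- only the first character of the argument matters:

```perl
# ord() looks only at the first character of its argument.
$code  = ord("A");        # 65 in ASCII
$first = ord("Perl");     # 80, the value of "P"; the rest is ignored
print "$code $first\n";
```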
- ''' Comments on f & d by gnb@melba.bby.oz.au 22/11/89
- .Ip "pack(TEMPLATE,LIST)" 8 4
- Takes an array or list of values and packs it into a binary structure,
- returning the string containing the structure.
- The TEMPLATE is a sequence of characters that gives the order and type
- of values, as follows:
- .nf
-
-     A    An ascii string, will be space padded.
-     a    An ascii string, will be null padded.
-     c    A signed char value.
-     C    An unsigned char value.
-     s    A signed short value.
-     S    An unsigned short value.
-     i    A signed integer value.
-     I    An unsigned integer value.
-     l    A signed long value.
-     L    An unsigned long value.
-     n    A short in \*(L"network\*(R" order.
-     N    A long in \*(L"network\*(R" order.
-     f    A single-precision float in the native format.
-     d    A double-precision float in the native format.
-     p    A pointer to a string.
-     v    A short in \*(L"VAX\*(R" (little-endian) order.
-     V    A long in \*(L"VAX\*(R" (little-endian) order.
-     x    A null byte.
-     X    Back up a byte.
-     @    Null fill to absolute position.
-     u    A uuencoded string.
-     b    A bit string (ascending bit order, like vec()).
-     B    A bit string (descending bit order).
-     h    A hex string (low nybble first).
-     H    A hex string (high nybble first).
-
- .fi
- Each letter may optionally be followed by a number which gives a repeat
- count.
- With all types except "a", "A", "b", "B", "h" and "H",
- the pack function will gobble up that many values
- from the LIST.
- A * for the repeat count means to use however many items are left.
- The "a" and "A" types gobble just one value, but pack it as a string of length
- count,
- padding with nulls or spaces as necessary.
- (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
- Likewise, the "b" and "B" fields pack a string that many bits long.
- The "h" and "H" fields pack a string that many nybbles long.
- Real numbers (floats and doubles) are in the native machine format
- only; due to the multiplicity of floating formats around, and the lack
- of a standard \*(L"network\*(R" representation, no facility for
- interchange has been made.
- This means that packed floating point data
- written on one machine may not be readable on another - even if both
- use IEEE floating point arithmetic (as the endian-ness of the memory
- representation is not part of the IEEE spec).
- Note that perl uses
- doubles internally for all numeric calculation, and converting from
- double -> float -> double will lose precision (i.e. unpack("f",
- pack("f", $foo)) will not in general equal $foo).
- .br
- Examples:
- .nf
-
- $foo = pack("cccc",65,66,67,68);
- # foo eq "ABCD"
- $foo = pack("c4",65,66,67,68);
- # same thing
-
- $foo = pack("ccxxcc",65,66,67,68);
- # foo eq "AB\e0\e0CD"
-
- $foo = pack("s2",1,2);
- # "\e1\e0\e2\e0" on little-endian
- # "\e0\e1\e0\e2" on big-endian
-
- $foo = pack("a4","abcd","x","y","z");
- # "abcd"
-
- $foo = pack("aaaa","abcd","x","y","z");
- # "axyz"
-
- $foo = pack("a14","abcdefg");
- # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
-
- $foo = pack("i9pl", gmtime);
- # a real struct tm (on my system anyway)
-
- sub bintodec {
- unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
- }
- .fi
- The same template may generally also be used in the unpack function.
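- .Sp
- A minimal sketch of a round trip through pack and unpack with one template.
- The record layout and the variable names are assumptions made up for the
- example, not anything perl prescribes:

```perl
# A 3-byte space-padded tag, a 16-bit network-order short,
# and an unsigned char, packed into one 6-byte record.
$template = "A3nC";
$record = pack($template, "hi", 258, 7);

# The same template takes the record apart again.
# "A" strips the trailing pad space on the way out.
($tag, $port, $flag) = unpack($template, $record);
print length($record), " bytes: tag=$tag port=$port flag=$flag\n";
```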
- .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
- Opens a pair of connected pipes like the corresponding system call.
- Note that if you set up a loop of piped processes, deadlock can occur
- unless you are very careful.
- In addition, note that perl's pipes use stdio buffering, so you may need
- to set $| to flush your WRITEHANDLE after each command, depending on
- the application.
- [Requires version 3.0 patchlevel 9.]
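- .Sp
- A minimal sketch of one-way traffic from a child to its parent, assuming
- fork() is available on your system.
- READER and WRITER are names chosen for the example.
- Each side closes the end it does not use, which is what lets the parent
- see end of file and helps avoid the deadlock mentioned above:

```perl
pipe(READER, WRITER) || die "Can't create pipe: $!\n";
$pid = fork;
die "Can't fork: $!\n" unless defined($pid);

if ($pid == 0) {                    # child: writes, never reads
    close(READER);
    select(WRITER); $| = 1;         # flush WRITER after each print
    print "hello from the child\n"; # goes to WRITER, the selected handle
    close(WRITER);
    exit 0;
}

close(WRITER);                      # parent: reads, never writes
$line = <READER>;                   # blocks until the child has written
close(READER);
wait;                               # reap the child
print "parent got: $line";
```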
- .Ip "pop(ARRAY)" 8
- .Ip "pop ARRAY" 8 6
- Pops and returns the last value of the array, shortening the array by 1.
- Has the same effect as
- .nf
-
- $tmp = $ARRAY[$#ARRAY\-\|\-];
-
- .fi
- If there are no elements in the array, returns the undefined value.
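- .Sp
- For instance (array names are invented for the example):

```perl
@stack = ('a', 'b', 'c');
$last = pop(@stack);          # 'c'; @stack is now ('a', 'b')

@empty = ();
$none = pop(@empty);          # nothing to pop, so the undefined value

print "popped $last, ", scalar(@stack), " left\n";
print "empty array gave undef\n" unless defined($none);
```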
- .Ip "print(FILEHANDLE LIST)" 8 10
- .Ip "print(LIST)" 8
- .Ip "print FILEHANDLE LIST" 8
- .Ip "print LIST" 8
- .Ip "print" 8
- Prints a string or a comma-separated list of strings.
- Returns non-zero if successful.
- FILEHANDLE may be a scalar variable name, in which case the variable contains
- the name of the filehandle, thus introducing one level of indirection.
- (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
- misinterpreted as an operator unless you interpose a + or put parens around
- the arguments.)
- If FILEHANDLE is omitted, prints by default to standard output (or to the
- last selected output channel\*(--see select()).
- If LIST is also omitted, prints $_ to
- .IR STDOUT .
- To set the default output channel to something other than
- .I STDOUT
- use the select operation.
- Note that, because print takes a LIST, anything in the LIST is evaluated
- in an array context, and any subroutine that you call will have one or more
- of its expressions evaluated in an array context.
- Also be careful not to follow the print keyword with a left parenthesis
- unless you want the corresponding right parenthesis to terminate the
- arguments to the print\*(--interpose a + or put parens around all the arguments.
- .Ip "printf(FILEHANDLE LIST)" 8 10
- .Ip "printf(LIST)" 8
- .Ip "printf FILEHANDLE LIST" 8
- .Ip "printf LIST" 8
- Equivalent to \*(L"print FILEHANDLE sprintf(LIST)\*(R".
- .Ip "push(ARRAY,LIST)" 8 7
- Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
- onto the end of ARRAY.
- The length of ARRAY increases by the length of LIST.
- Has the same effect as
- .nf
-
- for $value (LIST) {
-     $ARRAY[++$#ARRAY] = $value;
- }
-
- .fi
- but is more efficient.
- .Ip "q/STRING/" 8 5
- .Ip "qq/STRING/" 8
- .Ip "qx/STRING/" 8
- These are not really functions, but simply syntactic sugar to let you
- avoid putting too many backslashes into quoted strings.
- The q operator is a generalized single quote, and the qq operator a
- generalized double quote.
- The qx operator is a generalized backquote.
- Any non-alphanumeric delimiter can be used in place of /, including newline.
- If the delimiter is an opening bracket or parenthesis, the final delimiter
- will be the corresponding closing bracket or parenthesis.
- (Embedded occurrences of the closing bracket need to be backslashed as usual.)
- Examples:
- .nf
-
- .ne 5
- $foo = q!I said, "You said, \'She said it.\'"!;
- $bar = q(\'This is it.\');
- $today = qx{ date };
- $_ .= qq
- *** The previous line contains the naughty word "$&".\en
- if /(ibm|apple|awk)/; # :-)
-
- .fi
- .Ip "rand(EXPR)" 8 8
- .Ip "rand EXPR" 8
- .Ip "rand" 8
- Returns a random fractional number between 0 and the value of EXPR.
- (EXPR should be positive.)
- If EXPR is omitted, returns a value between 0 and 1.
- See also srand().
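- .Sp
- A sketch of the usual way to get a random integer, here a die roll.
- It assumes rand() never returns EXPR itself, so that int() yields 0 through 5:

```perl
srand(time);                  # seed once, early in the script
$roll = int(rand(6)) + 1;     # integer from 1 through 6
print "You rolled a $roll\n";
```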
- .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
- .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
- Attempts to read LENGTH bytes of data into variable SCALAR from the specified
- FILEHANDLE.
- Returns the number of bytes actually read, or undef if there was an error.
- SCALAR will be grown or shrunk to the length actually read.
- An OFFSET may be specified to place the read data at some other place
- than the beginning of the string.
- This call is actually implemented in terms of stdio's fread call. To get
- a true read system call, see sysread.
- .Ip "readdir(DIRHANDLE)" 8 3
- .Ip "readdir DIRHANDLE" 8
- Returns the next directory entry for a directory opened by opendir().
- If used in an array context, returns all the rest of the entries in the
- directory.
- If there are no more entries, returns an undefined value in a scalar context
- or a null list in an array context.
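- .Sp
- A minimal sketch that lists the current directory.
- The handle name DIR is just a choice for the example:

```perl
opendir(DIR, '.') || die "Can't open current directory: $!\n";
@names = readdir(DIR);        # array context: all the remaining entries
$more = readdir(DIR);         # scalar context: none left, so undefined
closedir(DIR);

@names = sort @names;
print join("\n", @names), "\n";
```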
- .Ip "readlink(EXPR)" 8 6
- .Ip "readlink EXPR" 8
- Returns the value of a symbolic link, if symbolic links are implemented.
- If not, gives a fatal error.
- If there is some system error, returns the undefined value and sets $! (errno).
- If EXPR is omitted, uses $_.
- .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
- Receives a message on a socket.
- Attempts to receive LEN bytes of data into variable SCALAR from the specified
- SOCKET filehandle.
- Returns the address of the sender, or the undefined value if there's an error.
- SCALAR will be grown or shrunk to the length actually read.
- Takes the same flags as the system call of the same name.
- .Ip "redo LABEL" 8 8
- .Ip "redo" 8
- The
- .I redo
- command restarts the loop block without evaluating the conditional again.
- The
- .I continue
- block, if any, is not executed.
- If the LABEL is omitted, the command refers to the innermost enclosing loop.
- This command is normally used by programs that want to lie to themselves
- about what was just input:
- .nf
-
- .ne 16
- # a simpleminded Pascal comment stripper
- # (warning: assumes no { or } in strings)
- line: while (<STDIN>) {
-     while (s|\|({.*}.*\|){.*}|$1 \||) {}
-     s|{.*}| \||;
-     if (s|{.*| \||) {
-         $front = $_;
-         while (<STDIN>) {
-             if (\|/\|}/\|) { # end of comment?
-                 s|^|$front{|;
-                 redo line;
-             }
-         }
-     }
-     print;
- }
-
- .fi
- .Ip "rename(OLDNAME,NEWNAME)" 8 2
- Changes the name of a file.
- Returns 1 for success, 0 otherwise.
- Will not work across filesystem boundaries.
- .Ip "require(EXPR)" 8 6
- .Ip "require EXPR" 8
- .Ip "require" 8
- Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
- Has semantics similar to the following subroutine:
- .nf
-
- sub require {
-     local($filename) = @_;
-     return 1 if $INC{$filename};
-     local($realfilename,$result);
-     ITER: {
-         foreach $prefix (@INC) {
-             $realfilename = "$prefix/$filename";
-             if (-f $realfilename) {
-                 $result = do $realfilename;
-                 last ITER;
-             }
-         }
-         die "Can't find $filename in \e@INC";
-     }
-     die $@ if $@;
-     die "$filename did not return true value" unless $result;
-     $INC{$filename} = $realfilename;
-     $result;
- }
-
- .fi
- Note that the file will not be included twice under the same specified name.
- .Ip "reset(EXPR)" 8 6
- .Ip "reset EXPR" 8
- .Ip "reset" 8
- Generally used in a
- .I continue
- block at the end of a loop to clear variables and reset ?? searches
- so that they work again.
- The expression is interpreted as a list of single characters (hyphens allowed
- for ranges).
- All variables and arrays beginning with one of those letters are reset to
- their pristine state.
- If the expression is omitted, one-match searches (?pattern?) are reset to
- match again.
- Only resets variables or searches in the current package.
- Always returns 1.
- Examples:
- .nf
-
- .ne 3
- reset \'X\'; \h'|2i'# reset all X variables
- reset \'a\-z\';\h'|2i'# reset lower case variables
- reset; \h'|2i'# just reset ?? searches
-
- .fi
- Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
- arrays.
- .Sp
- The use of reset on dbm associative arrays does not change the dbm file.
- (It does, however, flush any entries cached by perl, which may be useful if
- you are sharing the dbm file.
- Then again, maybe not.)
- .Ip "return LIST" 8 3
- Returns from a subroutine with the value specified.
- (Note that a subroutine can automatically return
- the value of the last expression evaluated.
- That's the preferred method\*(--use of an explicit
- .I return
- is a bit slower.)
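- .Sp
- A short sketch contrasting the two styles.
- The subroutine names max2 and first_negative are invented for the example:

```perl
sub max2 {
    local($x, $y) = @_;
    $x > $y ? $x : $y;           # last expression evaluated is the value
}

sub first_negative {
    local(@nums) = @_;
    foreach $n (@nums) {
        return $n if $n < 0;     # explicit return leaves the loop and the sub
    }
    0;                           # nothing negative found
}

$big = &max2(3, 7);
$neg = &first_negative(4, -2, 9, -5);
print "$big $neg\n";
```

- An explicit return earns its keep when you need to leave from the middle
- of a subroutine, as in the loop above.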
- .Ip "reverse(LIST)" 8 4
- .Ip "reverse LIST" 8
- In an array context, returns an array value consisting of the elements
- of LIST in the opposite order.
- In a scalar context, returns a string value consisting of the bytes of
- the first element of LIST in the opposite order.
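- .Sp
- For example (variable names invented for the illustration):

```perl
@countdown = reverse(1, 2, 3);       # array context: (3, 2, 1)
$backwards = reverse("hello");       # scalar context: "olleh"
print "@countdown $backwards\n";
```

- Note that print supplies an array context, so
- print reverse("hello") prints the string unchanged; assign to a scalar
- or use scalar() to get the bytes reversed.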
- .Ip "rewinddir(DIRHANDLE)" 8 5
- .Ip "rewinddir DIRHANDLE" 8
- Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
- .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
- .Ip "rindex(STR,SUBSTR)" 8 4
- Works just like index except that it
- returns the position of the LAST occurrence of SUBSTR in STR.
- If POSITION is specified, returns the last occurrence at or before that
- position.
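- .Sp
- A small sketch, assuming $[ is 0, that picks the last component off a
- pathname (the path is an arbitrary example):

```perl
$path = "/usr/local/bin/perl";
$slash = rindex($path, "/");             # 14, the last slash
$base = substr($path, $slash + 1);       # "perl"
$prev = rindex($path, "/", $slash - 1);  # 10, last slash at or before 13
print "$base $slash $prev\n";
```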
- .Ip "rmdir(FILENAME)" 8 4
- .Ip "rmdir FILENAME" 8
- Deletes the directory specified by FILENAME if it is empty.
- If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
- If FILENAME is omitted, uses $_.
- .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
- Searches a string for a pattern, and if found, replaces that pattern with the
- replacement text and returns the number of substitutions made.
- Otherwise it returns false (0).
- The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
- of the pattern are to be replaced.
- The \*(L"i\*(R" is also optional, and if present, indicates that matching
- is to be done in a case-insensitive manner.
- The \*(L"e\*(R" is likewise optional, and if present, indicates that
- the replacement string is to be evaluated as an expression rather than just
- as a double-quoted string.
- Any non-alphanumeric delimiter may replace the slashes;
- if single quotes are used, no
- interpretation is done on the replacement string (the e modifier overrides
- this, however); if backquotes are used, the replacement string is a command
- to execute whose output will be used as the actual replacement text.
- If no string is specified via the =~ or !~ operator,
- the $_ string is searched and modified.
- (The string specified with =~ must be a scalar variable, an array element,
- or an assignment to one of those, i.e. an lvalue.)
- If the pattern contains a $ that looks like a variable rather than an
- end-of-string test, the variable will be interpolated into the pattern at
- run-time.
- If you only want the pattern compiled once the first time the variable is
- interpolated, add an \*(L"o\*(R" at the end.
- If the PATTERN evaluates to a null string, the most recent successful
- regular expression is used instead.
- See also the section on regular expressions.
- Examples:
- .nf
-
- s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen
-
- $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
-
- s/Login: $foo/Login: $bar/; # run-time pattern
-
- ($foo = $bar) =~ s/bar/foo/;
-
- $_ = \'abc123xyz\';
- s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R'
- s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
- s/\ew/$& x 2/eg; # yields \*(L'aabbcc  224466xxyyzz\*(R'
-
- s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields
-
- .fi
- (Note the use of $ instead of \|\e\| in the last example.
- See the section on regular expressions.)
- .Ip "scalar(EXPR)" 8 3
- Forces EXPR to be interpreted in a scalar context and returns the value
- of EXPR.
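- .Sp
- The most common use is probably counting the elements of an array in a
- place that would otherwise supply an array context.
- A brief sketch (the array name is invented):

```perl
@queue = ('a', 'b', 'c');
$count = scalar(@queue);                  # 3, the number of elements
print "items: ", scalar(@queue), "\n";    # without scalar(), print would
                                          # get the three elements instead
```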
- .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
- Randomly positions the file pointer for FILEHANDLE, just like the fseek()
- call of stdio.
- FILEHANDLE may be an expression whose value gives the name of the filehandle.
- Returns 1 upon success, 0 otherwise.
- .Ip "seekdir(DIRHANDLE,POS)" 8 3
- Sets the current position for the readdir() routine on DIRHANDLE.
- POS must be a value returned by telldir().
- Has the same caveats about possible directory compaction as the corresponding
- system library routine.
- .Ip "select(FILEHANDLE)" 8 3
- .Ip "select" 8 3
- Returns the currently selected filehandle.
- Sets the current default filehandle for output, if FILEHANDLE is supplied.
- This has two effects: first, a
- .I write
- or a
- .I print
- without a filehandle will default to this FILEHANDLE.
- Second, references to variables related to output will refer to this output
- channel.
- For example, if you have to set the top of form format for more than
- one output channel, you might do the following:
- .nf
-
- .ne 4
- select(REPORT1);
- $^ = \'report1_top\';
- select(REPORT2);
- $^ = \'report2_top\';
-
- .fi
- FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
- Thus:
- .nf
-
- $oldfh = select(STDERR); $| = 1; select($oldfh);
-
- .fi
- .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
- This calls the select system call with the bitmasks specified, which can
- be constructed using fileno() and vec(), along these lines:
- .nf
-
- $rin = $win = $ein = '';
- vec($rin,fileno(STDIN),1) = 1;
- vec($win,fileno(STDOUT),1) = 1;
- $ein = $rin | $win;
-
- .fi
- If you want to select on many filehandles you might wish to write a subroutine:
- .nf
-
- sub fhbits {
-     local(@fhlist) = split(' ',$_[0]);
-     local($bits);
-     for (@fhlist) {
-         vec($bits,fileno($_),1) = 1;
-     }
-     $bits;
- }
- $rin = &fhbits('STDIN TTY SOCK');
-
- .fi
- The usual idiom is:
- .nf
-
- ($nfound,$timeleft) =
- select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
-
- or to block until something becomes ready:
-
- .ie t \{\
- $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
- 'br\}
- .el \{\
- $nfound = select($rout=$rin, $wout=$win,
- $eout=$ein, undef);
- 'br\}
-
- .fi
- Any of the bitmasks can also be undef.
- The timeout, if specified, is in seconds, which may be fractional.
- NOTE: not all implementations are capable of returning the $timeleft.
- If not, they always return $timeleft equal to the supplied $timeout.
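- .Sp
- Since any of the bitmasks can be undef, one use of this call is a sleep
- finer than a whole second.
- The sketch below assumes your system's select(2) honors fractional
- timeouts; the 0.25 is an arbitrary value:

```perl
# No filehandles to watch, only a timeout: pause for a quarter second.
$nfound = select(undef, undef, undef, 0.25);
print "nothing became ready\n" if $nfound == 0;
```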
- .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
- Calls the System V IPC function semctl. If CMD is &IPC_STAT or
- &GETALL, then ARG must be a variable which will hold the returned
- semid_ds structure or semaphore value array. Returns like ioctl: the
- undefined value for error, "0 but true" for zero, or the actual return
- value otherwise.
- .Ip "semget(KEY,NSEMS,FLAGS)" 8 4
- Calls the System V IPC function semget. Returns the semaphore id, or
- the undefined value if there is an error.
- .Ip "semop(KEY,OPSTRING)" 8 4
- Calls the System V IPC function semop to perform semaphore operations
- such as signaling and waiting. OPSTRING must be a packed array of
- semop structures. Each semop structure can be generated with
- \&'pack("sss", $semnum, $semop, $semflag)'. The number of semaphore
- operations is implied by the length of OPSTRING. Returns true if
- successful, or false if there is an error. As an example, the
- following code waits on semaphore $semnum of semaphore id $semid:
- .nf
-
- $semop = pack("sss", $semnum, -1, 0);
- die "Semaphore trouble: $!\en" unless semop($semid, $semop);
-
- .fi
- To signal the semaphore, replace "-1" with "1".
- .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
- .Ip "send(SOCKET,MSG,FLAGS)" 8
- Sends a message on a socket.
- Takes the same flags as the system call of the same name.
- On unconnected sockets you must specify a destination to send TO.
- Returns the number of characters sent, or the undefined value if
- there is an error.
- .Ip "setpgrp(PID,PGRP)" 8 4
- Sets the current process group for the specified PID, 0 for the current
- process.
- Will produce a fatal error if used on a machine that doesn't implement
- setpgrp(2).
- .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
- Sets the current priority for a process, a process group, or a user.
- (See setpriority(2).)
- Will produce a fatal error if used on a machine that doesn't implement
- setpriority(2).
- .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
- Sets the socket option requested.
- Returns undefined if there is an error.
- OPTVAL may be specified as undef if you don't want to pass an argument.
- .Ip "shift(ARRAY)" 8 6
- .Ip "shift ARRAY" 8
- .Ip "shift" 8
- Shifts the first value of the array off and returns it,
- shortening the array by 1 and moving everything down.
- If there are no elements in the array, returns the undefined value.
- If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
- array in subroutines.
- (This is determined lexically.)
- See also unshift(), push() and pop().
- Shift() and unshift() do the same thing to the left end of an array that push()
- and pop() do to the right end.
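- .Sp
- A short sketch of all four operators working on one array (the array name
- is invented for the example):

```perl
@list = (2, 3);
unshift(@list, 1);        # add at the left:   (1, 2, 3)
push(@list, 4);           # add at the right:  (1, 2, 3, 4)
$first = shift(@list);    # 1, leaving (2, 3, 4)
$last  = pop(@list);      # 4, leaving (2, 3)
print "$first $last (@list)\n";
```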
- .Ip "shmctl(ID,CMD,ARG)" 8 4
- Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG
- must be a variable which will hold the returned shmid_ds structure.
- Returns like ioctl: the undefined value for error, "0 but true" for
- zero, or the actual return value otherwise.
- .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
- Calls the System V IPC function shmget. Returns the shared memory
- segment id, or the undefined value if there is an error.
- .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
- .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
- Reads or writes the System V shared memory segment ID starting at
- position POS for size SIZE by attaching to it, copying in/out, and
- detaching from it. When reading, VAR must be a variable which
- will hold the data read. When writing, if STRING is too long,
- only SIZE bytes are used; if STRING is too short, nulls are
- written to fill out SIZE bytes. Returns true if successful, or
- false if there is an error.
- .Ip "shutdown(SOCKET,HOW)" 8 3
- Shuts down a socket connection in the manner indicated by HOW, which has
- the same interpretation as in the system call of the same name.
- .Ip "sin(EXPR)" 8 4
- .Ip "sin EXPR" 8
- Returns the sine of EXPR (expressed in radians).
- If EXPR is omitted, returns sine of $_.
- .Ip "sleep(EXPR)" 8 6
- .Ip "sleep EXPR" 8
- .Ip "sleep" 8
- Causes the script to sleep for EXPR seconds, or forever if no EXPR.
- May be interrupted by sending the process a SIGALRM.
- Returns the number of seconds actually slept.
- You probably cannot mix alarm() and sleep() calls, since sleep() is
- often implemented using alarm().
- .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
- Opens a socket of the specified kind and attaches it to filehandle SOCKET.
- DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
- of the same name.
- You may need to run h2ph on sys/socket.h to get the proper values handy
- in a perl library file.
- Returns true if successful.
- See the example in the section on Interprocess Communication.
- .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
- Creates an unnamed pair of sockets in the specified domain, of the specified
- type.
- DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
- of the same name.
- If unimplemented, yields a fatal error.
- Returns true if successful.
- .Ip "sort(SUBROUTINE LIST)" 8 9
- .Ip "sort(LIST)" 8
- .Ip "sort SUBROUTINE LIST" 8
- .Ip "sort BLOCK LIST" 8
- .Ip "sort LIST" 8
- Sorts the LIST and returns the sorted array value.
- Nonexistent values of arrays are stripped out.
- If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order.
- If SUBROUTINE is specified, gives the name of a subroutine that returns
- an integer less than, equal to, or greater than 0,
- depending on how the elements of the array are to be ordered.
- (The <=> and cmp operators are extremely useful in such routines.)
- SUBROUTINE may be a scalar variable name, in which case the value provides
- the name of the subroutine to use.
- In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous,
- in-line sort subroutine.
- .Sp
- In the interests of efficiency the normal calling code for subroutines
- is bypassed, with the following effects: the subroutine may not be a recursive
- subroutine, and the two elements to be compared are passed into the subroutine
- not via @_ but as $a and $b (see example below).
- They are passed by reference, so don't modify $a and $b.
- .Sp
- Examples:
- .nf
-
- .ne 2
- # sort lexically
- @articles = sort @files;
-
- .ne 2
- # same thing, but with explicit sort routine
- @articles = sort {$a cmp $b;} @files;
-
- .ne 2
- # same thing in reversed order
- @articles = sort {$b cmp $a;} @files;
-
- .ne 2
- # sort numerically ascending
- @articles = sort {$a <=> $b;} @files;
-
- .ne 2
- # sort numerically descending
- @articles = sort {$b <=> $a;} @files;
-
- .ne 5
- # sort using explicit subroutine name
- sub byage {
-     $age{$a} <=> $age{$b}; # presuming integers
- }
- @sortedclass = sort byage @class;
-
- .ne 9
- sub reverse { $b cmp $a; }
- @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
- @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
- print sort @harry;
- # prints AbelCaincatdogx
- print sort reverse @harry;
- # prints xdogcatCainAbel
- print sort @george, \'to\', @harry;
- # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
-
- .fi
- .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
- .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
- .Ip "splice(ARRAY,OFFSET)" 8
- Removes the elements designated by OFFSET and LENGTH from an array, and
- replaces them with the elements of LIST, if any.
- Returns the elements removed from the array.
- The array grows or shrinks as necessary.
- If LENGTH is omitted, removes everything from OFFSET onward.
- The following equivalencies hold (assuming $[ == 0):
- .nf
-
- push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
- pop(@a)\h'|3.5i'splice(@a,-1)
- shift(@a)\h'|3.5i'splice(@a,0,1)
- unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
- $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
-
- Example, assuming array lengths are passed before arrays:
-
- sub aeq { # compare two array values
-     local(@a) = splice(@_,0,shift);
-     local(@b) = splice(@_,0,shift);
-     return 0 unless @a == @b; # same len?
-     while (@a) {
-         return 0 if pop(@a) ne pop(@b);
-     }
-     return 1;
- }
- if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
-
- .fi
- .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
- .Ip "split(/PATTERN/,EXPR)" 8 8
- .Ip "split(/PATTERN/)" 8
- .Ip "split" 8
- Splits a string into an array of strings, and returns it.
- (If not in an array context, returns the number of fields found and splits
- into the @_ array.
- (In an array context, you can force the split into @_
- by using ?? as the pattern delimiters, but it still returns the array value.))
- If EXPR is omitted, splits the $_ string.
- If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
- Anything matching PATTERN is taken to be a delimiter separating the fields.
- (Note that the delimiter may be longer than one character.)
- If LIMIT is specified, splits into no more than that many fields (though it
- may split into fewer).
- If LIMIT is unspecified, trailing null fields are stripped (which
- potential users of pop() would do well to remember).
- A pattern matching the null string (not to be confused with a null pattern //,
- which is just one member of the set of patterns matching a null string)
- will split the value of EXPR into separate characters at each point it
- matches that way.
- For example:
- .nf
-
- print join(\':\', split(/ */, \'hi there\'));
-
- .fi
- produces the output \*(L'h:i:t:h:e:r:e\*(R'.
- .Sp
- The LIMIT parameter can be used to partially split a line:
- .nf
-
- ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
-
- .fi
- (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
- larger than the number of variables in the list, to avoid unnecessary work.
- For the list above LIMIT would have been 4 by default.
- In time critical applications it behooves you not to split into
- more fields than you really need.)
- .Sp
- If the PATTERN contains parentheses, additional array elements are created
- from each matching substring in the delimiter.
- .Sp
- split(/([,-])/,"1-10,20");
- .Sp
- produces the array value
- .Sp
- (1,'-',10,',',20)
- .Sp
- The pattern /PATTERN/ may be replaced with an expression to specify patterns
- that vary at runtime.
- (To do runtime compilation only once, use /$variable/o.)
- As a special case, specifying a space (\'\ \') will split on white space
- just as split with no arguments does, but leading white space does NOT
- produce a null first field.
- Thus, split(\'\ \') can be used to emulate
- .IR awk 's
- default behavior, whereas
- split(/\ /) will give you as many null initial fields as there are
- leading spaces.
- .Sp
- Example:
- .nf
-
- .ne 5
- open(passwd, \'/etc/passwd\');
- while (<passwd>) {
- .ie t \{\
-     ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
- 'br\}
- .el \{\
-     ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
-         = split(\|/\|:\|/\|);
- 'br\}
-     .\|.\|.
- }
-
- .fi
- (Note that $shell above will still have a newline on it. See chop().)
- See also
- .IR join .
- .Ip "sprintf(FORMAT,LIST)" 8 4
- Returns a string formatted by the usual printf conventions.
- The * character is not supported.
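- .Sp
- Since * is not available, one workaround is to build the field width into
- the format string before calling sprintf.
- A minimal sketch; the variable names and the width of 6 are arbitrary:

```perl
$width = 6;
# Assemble "%6d|%-6s|" at run time instead of writing "%*d|%-*s|".
$format = "%" . $width . "d|%-" . $width . "s|";
$line = sprintf($format, 42, "ab");
print "$line\n";              # prints "    42|ab    |"
```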
- .Ip "sqrt(EXPR)" 8 4
- .Ip "sqrt EXPR" 8
- Returns the square root of EXPR.
- If EXPR is omitted, returns square root of $_.
- .Ip "srand(EXPR)" 8 4
- .Ip "srand EXPR" 8
- Sets the random number seed for the
- .I rand
- operator.
- If EXPR is omitted, does srand(time).
- .Ip "stat(FILEHANDLE)" 8 8
- .Ip "stat FILEHANDLE" 8
- .Ip "stat(EXPR)" 8
- .Ip "stat SCALARVARIABLE" 8
- Returns a 13-element array giving the statistics for a file, either the file
- opened via FILEHANDLE, or named by EXPR.
- Typically used as follows:
- .nf
-
- .ne 3
- ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
- $atime,$mtime,$ctime,$blksize,$blocks)
- = stat($filename);
-
- .fi
- If stat is passed the special filehandle consisting of an underline,
- no stat is done, but the current contents of the stat structure from
- the last stat or filetest are returned.
- Example:
- .nf
-
- .ne 3
- if (-x $file && (($d) = stat(_)) && $d < 0) {
-     print "$file is executable NFS file\en";
- }
-
- .fi
- (This only works on machines for which the device number is negative under NFS.)
- .Ip "study(SCALAR)" 8 6
- .Ip "study SCALAR" 8
- .Ip "study" 8
- Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
- doing many pattern matches on the string before it is next modified.
- This may or may not save time, depending on the nature and number of patterns
- you are searching on, and on the distribution of character frequencies in
- the string to be searched\*(--you probably want to compare runtimes with and
- without it to see which runs faster.
- Those loops which scan for many short constant strings (including the constant
- parts of more complex patterns) will benefit most.
- You may have only one study active at a time\*(--if you study a different
- scalar the first is \*(L"unstudied\*(R".
- (The way study works is this: a linked list of every character in the string
- to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
- are.
- From each search string, the rarest character is selected, based on some
- static frequency tables constructed from some C programs and English text.
- Only those places that contain this \*(L"rarest\*(R" character are examined.)
- .Sp
- For example, here is a loop which inserts index-producing entries before any line
- containing a certain pattern:
- .nf
-
- .ne 8
- while (<>) {
-     study;
-     print ".IX foo\en" if /\ebfoo\eb/;
-     print ".IX bar\en" if /\ebbar\eb/;
-     print ".IX blurfl\en" if /\ebblurfl\eb/;
-     .\|.\|.
-     print;
- }
-
- .fi
- In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
- will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
- In general, this is a big win except in pathological cases.
- The only question is whether it saves you more time than it took to build
- the linked list in the first place.
- .Sp
- Note that if you have to look for strings that you don't know till runtime,
- you can build an entire loop as a string and eval that to avoid recompiling
- all your patterns all the time.
- Together with undefining $/ to input entire files as one record, this can
- be very fast, often faster than specialized programs like fgrep.
- The following scans a list of files (@files)
- for a list of words (@words), and prints out the names of those files that
- contain a match:
- .nf
-
- .ne 12
- $search = \'while (<>) { study;\';
- foreach $word (@words) {
-     $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en";
- }
- $search .= "}";
- @ARGV = @files;
- undef $/;
- eval $search; # this screams
- $/ = "\en"; # put back to normal input delim
- foreach $file (sort keys(%seen)) {
-     print $file, "\en";
- }
-
- .fi
- .Ip "substr(EXPR,OFFSET,LEN)" 8 2
- .Ip "substr(EXPR,OFFSET)" 8 2
- Extracts a substring out of EXPR and returns it.
- First character is at offset 0, or whatever you've set $[ to.
- If OFFSET is negative, starts that far from the end of the string.
- If LEN is omitted, returns everything to the end of the string.
- You can use the substr() function as an lvalue, in which case EXPR must
- be an lvalue.
- If you assign something shorter than LEN, the string will shrink, and
- if you assign something longer than LEN, the string will grow to accommodate it.
- To keep the string the same length you may need to pad or chop your value using
- sprintf().
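- .Sp
- The following sketch exercises both the rvalue and lvalue forms; it assumes
- $[ has been left at 0, and the strings are only examples:

```perl
$s = "Practical Extraction";

$first = substr($s, 0, 9);        # "Practical"
$tail  = substr($s, -10);         # last ten characters: "Extraction"

# Assigning to substr() replaces that part of $s; the string shrinks
# or grows to fit the new value.
substr($s, 0, 9) = "Easy";        # $s is now "Easy Extraction"

# Pad with sprintf() when the overall length must stay the same.
$t = "abcdefgh";
substr($t, 2, 4) = sprintf("%-4s", "XY");    # $t is now "abXY  gh"

print "$first\n$tail\n$s\n$t\n";
```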
- .Ip "symlink(OLDFILE,NEWFILE)" 8 2
- Creates a new filename symbolically linked to the old filename.
- Returns 1 for success, 0 otherwise.
- On systems that don't support symbolic links, produces a fatal error at
- run time.
- To check for that, use eval:
- .nf
-
- $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
-
- .fi
- .Ip "syscall(LIST)" 8 6
- .Ip "syscall LIST" 8
- Calls the system call specified as the first element of the list, passing
- the remaining elements as arguments to the system call.
- If unimplemented, produces a fatal error.
- The arguments are interpreted as follows: if a given argument is numeric,
- the argument is passed as an int.
- If not, the pointer to the string value is passed.
- It is your responsibility to make sure that a string is pre-extended long enough
- to receive any result that might be written into it.
- If your integer arguments are not literals and have never been interpreted
- in a numeric context, you may need to add 0 to them to force them to look
- like numbers.
- .nf
-
- require 'syscall.ph'; # may need to run h2ph
- syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
-
- .fi
- .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
- .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
- Attempts to read LENGTH bytes of data into variable SCALAR from the specified
- FILEHANDLE, using the system call read(2).
- It bypasses stdio, so mixing this with other kinds of reads may cause
- confusion.
- Returns the number of bytes actually read, or undef if there was an error.
- SCALAR will be grown or shrunk to the length actually read.
- An OFFSET may be specified to place the read data at some other place
- than the beginning of the string.
- .Ip "system(LIST)" 8 6
- .Ip "system LIST" 8
- Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
- is done first, and the parent process waits for the child process to complete.
- Note that argument processing varies depending on the number of arguments.
- The return value is the exit status of the program as returned by the wait()
- call.
- To get the actual exit value divide by 256.
- See also
- .IR exec .
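- .Sp
- As a sketch of decoding the return value: the child below is simply another
- copy of perl, on the assumption that $^X names a runnable perl binary.

```perl
# Run a child perl that exits with status 3.  Passing the arguments as a
# list avoids any shell processing of the command.
$status = system($^X, '-e', 'exit 3');

$exit   = int($status / 256);     # high byte: the value passed to exit
$signal = $status % 256;          # low byte: nonzero if a signal ended it

print "wait status $status, exit value $exit\n";
```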
- .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
- .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
- Attempts to write LENGTH bytes of data from variable SCALAR to the specified
- FILEHANDLE, using the system call write(2).
- It bypasses stdio, so mixing this with prints may cause
- confusion.
- Returns the number of bytes actually written, or undef if there was an error.
- An OFFSET may be specified to write the data from some other place
- than the beginning of the string.
- .Ip "tell(FILEHANDLE)" 8 6
- .Ip "tell FILEHANDLE" 8 6
- .Ip "tell" 8
- Returns the current file position for FILEHANDLE.
- FILEHANDLE may be an expression whose value gives the name of the actual
- filehandle.
- If FILEHANDLE is omitted, assumes the file last read.
- .Ip "telldir(DIRHANDLE)" 8 5
- .Ip "telldir DIRHANDLE" 8
- Returns the current position of the readdir() routines on DIRHANDLE.
- Value may be given to seekdir() to access a particular location in
- a directory.
- Has the same caveats about possible directory compaction as the corresponding
- system library routine.
- .Ip "time" 8 4
- Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
- Suitable for feeding to gmtime() and localtime().
- .Ip "times" 8 4
- Returns a four-element array giving the user and system times, in seconds, for this
- process and the children of this process.
- .Sp
- ($user,$system,$cuser,$csystem) = times;
- .Sp
- .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
- .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
- Translates all occurrences of the characters found in the search list to
- the corresponding character in the replacement list.
- It returns the number of characters replaced or deleted.
- If no string is specified via the =~ or !~ operator,
- the $_ string is translated.
- (The string specified with =~ must be a scalar variable, an array element,
- or an assignment to one of those, i.e. an lvalue.)
- For
- .I sed
- devotees,
- .I y
- is provided as a synonym for
- .IR tr .
- .Sp
- If the c modifier is specified, the SEARCHLIST character set is complemented.
- If the d modifier is specified, any characters specified by SEARCHLIST that
- are not found in REPLACEMENTLIST are deleted.
- (Note that this is slightly more flexible than the behavior of some
- .I tr
- programs, which delete anything they find in the SEARCHLIST, period.)
- If the s modifier is specified, sequences of characters that were translated
- to the same character are squashed down to 1 instance of the character.
- .Sp
- If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
- as specified.
- Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
- the final character is replicated till it is long enough.
- If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
- This latter is useful for counting characters in a class, or for squashing
- character sequences in a class.
- .Sp
- Examples:
- .nf
-
- $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case
-
- $cnt = tr/*/*/; \h'|3i'# count the stars in $_
-
- $cnt = tr/0\-9//; \h'|3i'# count the digits in $_
-
- tr/a\-zA\-Z//s; \h'|3i'# bookkeeper \-> bokeper
-
- ($HOST = $host) =~ tr/a\-z/A\-Z/;
-
- y/a\-zA\-Z/ /cs; \h'|3i'# change non-alphas to single space
-
- tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
-
- .fi
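- .Sp
- The modifiers can be seen side by side in the following sketch, which works
- on copies of one made-up string so that each result can be compared with the
- original:

```perl
$text = "hello,   world!!";

# Plain count: with a null REPLACEMENTLIST nothing is changed.
$vowels = ($text =~ tr/aeiou//);          # 3

# d: delete the listed characters that have no replacement.
($nopunct = $text) =~ tr/,!//d;           # "hello   world"

# s: squash runs of the same translated character.
($squashed = $nopunct) =~ tr/ //s;        # "hello world"

# c: complement the SEARCHLIST; every non-letter becomes "_".
($ident = $text) =~ tr/a-zA-Z/_/c;        # "hello____world__"

print "$vowels\n$nopunct\n$squashed\n$ident\n";
```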
- .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
- .Ip "truncate(EXPR,LENGTH)" 8
- Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
- length.
- Produces a fatal error if truncate isn't implemented on your system.
- .Ip "umask(EXPR)" 8 4
- .Ip "umask EXPR" 8
- .Ip "umask" 8
- Sets the umask for the process and returns the old one.
- If EXPR is omitted, merely returns current umask.
- .Ip "undef(EXPR)" 8 6
- .Ip "undef EXPR" 8
- .Ip "undef" 8
- Undefines the value of EXPR, which must be an lvalue.
- Use only on a scalar value, an entire array, or a subroutine name (using &).
- (Undef will probably not do what you expect on most predefined variables or
- dbm array values.)
- Always returns the undefined value.
- You can omit the EXPR, in which case nothing is undefined, but you still
- get an undefined value that you could, for instance, return from a subroutine.
- Examples:
- .nf
-
- .ne 6
- undef $foo;
- undef $bar{'blurfl'};
- undef @ary;
- undef %assoc;
- undef &mysub;
- return (wantarray ? () : undef) if $they_blew_it;
-
- .fi
- .Ip "unlink(LIST)" 8 4
- .Ip "unlink LIST" 8
- Deletes a list of files.
- Returns the number of files successfully deleted.
- .nf
-
- .ne 3
- $cnt = unlink \'a\', \'b\', \'c\';
- unlink @goners;
- unlink <*.bak>;
-
- .fi
- Note: unlink will not delete directories unless you are superuser and the
- .B \-U
- flag is supplied to
- .IR perl .
- Even if these conditions are met, be warned that unlinking a directory
- can inflict damage on your filesystem.
- Use rmdir instead.
- .Ip "unpack(TEMPLATE,EXPR)" 8 4
- Unpack does the reverse of pack: it takes a string representing
- a structure and expands it out into an array value, returning the array
- value.
- (In a scalar context, it merely returns the first value produced.)
- The TEMPLATE has the same format as in the pack function.
- Here's a subroutine that does substring:
- .nf
-
- .ne 4
- sub substr {
-     local($what,$where,$howmuch) = @_;
-     unpack("x$where a$howmuch", $what);
- }
-
- .ne 3
- and then there's
-
- sub ord { unpack("c",$_[0]); }
-
- .fi
- In addition, you may prefix a field with a %<number> to indicate that
- you want a <number>-bit checksum of the items instead of the items themselves.
- Default is a 16-bit checksum.
- For example, the following computes the same number as the System V sum program:
- .nf
-
- .ne 4
- while (<>) {
-     $checksum += unpack("%16C*", $_);
- }
- $checksum %= 65536;
-
- .fi
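- .Sp
- Both uses can be checked on a small fixed string; this is only a sketch, and
- the data and variable names are invented:

```perl
$record = "abcdefgh";

# Skip two bytes, then take three: the same as substr($record, 2, 3).
($piece) = unpack("x2 a3", $record);      # "cde"

# A 16-bit checksum of every byte, instead of the bytes themselves.
$sum = unpack("%16C*", "abc");            # 97 + 98 + 99

# The same total computed the long way, for comparison.
$check = 0;
foreach $byte (unpack("C*", "abc")) {
    $check += $byte;
}

print "$piece $sum $check\n";
```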
- .Ip "unshift(ARRAY,LIST)" 8 4
- Does the opposite of a
- .IR shift .
- Or the opposite of a
- .IR push ,
- depending on how you look at it.
- Prepends list to the front of the array, and returns the number of elements
- in the new array.
- .nf
-
- unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
-
- .fi
- .Ip "utime(LIST)" 8 2
- .Ip "utime LIST" 8 2
- Changes the access and modification times on each file of a list of files.
- The first two elements of the list must be the NUMERICAL access and
- modification times, in that order.
- Returns the number of files successfully changed.
- The inode modification time of each file is set to the current time.
- Example of a \*(L"touch\*(R" command:
- .nf
-
- .ne 3
- #!/usr/bin/perl
- $now = time;
- utime $now, $now, @ARGV;
-
- .fi
- .Ip "values(ASSOC_ARRAY)" 8 6
- .Ip "values ASSOC_ARRAY" 8
- Returns a normal array consisting of all the values of the named associative
- array.
- The values are returned in an apparently random order, but it is the same order
- as either the keys() or each() function would produce on the same array.
- See also keys() and each().
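- .Sp
- A short sketch of the ordering guarantee, using a made-up array; the order
- printed may differ from run to run, but the two lists always line up:

```perl
%age = ('tom', 30, 'dick', 25, 'harry', 41);

@names = keys(%age);
@ages  = values(%age);

# The two lists correspond element for element, whatever the order is.
$mismatch = 0;
for ($i = 0; $i <= $#names; $i++) {
    $mismatch++ if $age{$names[$i]} != $ages[$i];
    print "$names[$i] is $ages[$i]\n";
}

# Sum the values without caring which key each one belongs to.
$total = 0;
foreach $n (@ages) {
    $total += $n;
}
```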
- .Ip "vec(EXPR,OFFSET,BITS)" 8 2
- Treats a string as a vector of unsigned integers, and returns the value
- of the bitfield specified.
- May also be assigned to.
- BITS must be a power of two from 1 to 32.
- .Sp
- Vectors created with vec() can also be manipulated with the logical operators
- |, & and ^,
- which will assume a bit vector operation is desired when both operands are
- strings.
- This interpretation is not enabled unless there is at least one vec() in
- your program, to protect older programs.
- .Sp
- To transform a bit vector into a string or array of 0's and 1's, use these:
- .nf
-
- $bits = unpack("b*", $vector);
- @bits = split(//, unpack("b*", $vector));
-
- .fi
- If you know the exact length in bits, it can be used in place of the *.
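- .Sp
- Putting these pieces together, as a sketch with one-bit fields (the variable
- names are arbitrary):

```perl
$vector = '';
vec($vector, 0, 1) = 1;           # turn on bit 0
vec($vector, 3, 1) = 1;           # turn on bit 3
vec($vector, 9, 1) = 1;           # the string grows to a second byte

$bits = unpack("b*", $vector);    # "1001000001000000"
$on   = ($bits =~ tr/1//);        # number of bits that are set

# Both operands are strings, so | works on the vectors bit by bit.
$other = '';
vec($other, 1, 1) = 1;
$both = unpack("b*", $vector | $other);    # "1101000001000000"

print "$bits\n$both\n";
```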
- .Ip "wait" 8 6
- Waits for a child process to terminate and returns the pid of the deceased
- process, or -1 if there are no child processes.
- The status is returned in $?.
- .Ip "waitpid(PID,FLAGS)" 8 6
- Waits for a particular child process to terminate and returns the pid of the deceased
- process, or -1 if there is no such child process.
- The status is returned in $?.
- If you say
- .nf
-
- require "sys/wait.ph"; # may need to run h2ph
- .\|.\|.
- waitpid(-1,&WNOHANG);
-
- .fi
- then you can do a non-blocking wait for any process. Non-blocking wait
- is only available on machines supporting either the
- .I waitpid (2)
- or
- .I wait4 (2)
- system calls.
- However, waiting for a particular pid with FLAGS of 0 is implemented
- everywhere. (Perl emulates the system call by remembering the status
- values of processes that have exited but have not been harvested by the
- Perl script yet.)
- .Ip "wantarray" 8 4
- Returns true if the context of the currently executing subroutine
- is looking for an array value.
- Returns false if the context is looking for a scalar.
- .nf
-
- return wantarray ? () : undef;
-
- .fi
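- .Sp
- A subroutine can also use it to choose between a list and a summary of that
- list. In this sketch the routine name and its data are hypothetical:

```perl
sub lookup {
    local($name) = @_;
    local(@hits) = grep(/^$name/, 'perl', 'pearl', 'awk');
    # An array context gets every match; a scalar context gets the count.
    wantarray ? @hits : $#hits + 1;
}

@all   = &lookup('pe');           # ('perl', 'pearl')
$count = &lookup('pe');           # 2

print "@all\n$count\n";
```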
- .Ip "warn(LIST)" 8 4
- .Ip "warn LIST" 8
- Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
- .Ip "write(FILEHANDLE)" 8 6
- .Ip "write(EXPR)" 8
- .Ip "write" 8
- Writes a formatted record (possibly multi-line) to the specified file,
- using the format associated with that file.
- By default the format for a file is the one having the same name as the
- filehandle, but the format for the current output channel (see
- .IR select )
- may be set explicitly
- by assigning the name of the format to the $~ variable.
- .Sp
- Top of form processing is handled automatically:
- if there is insufficient room on the current page for the formatted
- record, the page is advanced by writing a form feed,
- a special top-of-page format is used
- to format the new page header, and then the record is written.
- By default the top-of-page format is the name of the filehandle with
- \*(L"_TOP\*(R" appended, but it may be dynamically set to the
- format of your choice by assigning the name to the $^ variable while
- the filehandle is selected.
- The number of lines remaining on the current page is in variable $-, which
- can be set to 0 to force a new page.
- .Sp
- If FILEHANDLE is unspecified, output goes to the current default output channel,
- which starts out as
- .I STDOUT
- but may be changed by the
- .I select
- operator.
- If the FILEHANDLE is an EXPR, then the expression is evaluated and the
- resulting string is used to look up the name of the FILEHANDLE at run time.
- For more on formats, see the section on formats later on.
- .Sp
- Note that write is NOT the opposite of read.
- .Sh "Precedence"
- .I Perl
- operators have the following associativity and precedence:
- .nf
-
- nonassoc\h'|1i'print printf exec system sort reverse
- \h'1.5i'chmod chown kill unlink utime die return
- left\h'|1i',
- right\h'|1i'= += \-= *= etc.
- right\h'|1i'?:
- nonassoc\h'|1i'.\|.
- left\h'|1i'||
- left\h'|1i'&&
- left\h'|1i'| ^
- left\h'|1i'&
- nonassoc\h'|1i'== != <=> eq ne cmp
- nonassoc\h'|1i'< > <= >= lt gt le ge
- nonassoc\h'|1i'chdir exit eval reset sleep rand umask
- nonassoc\h'|1i'\-r \-w \-x etc.
- left\h'|1i'<< >>
- left\h'|1i'+ \- .
- left\h'|1i'* / % x
- left\h'|1i'=~ !~
- right\h'|1i'! ~ and unary minus
- right\h'|1i'**
- nonassoc\h'|1i'++ \-\|\-
- left\h'|1i'\*(L'(\*(R'
-
- .fi
- As mentioned earlier, if any list operator (print, etc.) or
- any unary operator (chdir, etc.)
- is followed by a left parenthesis as the next token on the same line,
- the operator and arguments within parentheses are taken to
- be of highest precedence, just like a normal function call.
- Examples:
- .nf
-
- chdir $foo || die;\h'|3i'# (chdir $foo) || die
- chdir($foo) || die;\h'|3i'# (chdir $foo) || die
- chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
- chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
-
- but, because * is higher precedence than ||:
-
- chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
- chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
- chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
- chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
-
- rand 10 * 20;\h'|3i'# rand (10 * 20)
- rand(10) * 20;\h'|3i'# (rand 10) * 20
- rand (10) * 20;\h'|3i'# (rand 10) * 20
- rand +(10) * 20;\h'|3i'# rand (10 * 20)
-
- .fi
- In the absence of parentheses,
- the precedence of list operators such as print, sort or chmod is
- either very high or very low depending on whether you look at the left
- side of the operator or the right side of it.
- For example, in
- .nf
-
- @ary = (1, 3, sort 4, 2);
- print @ary; # prints 1324
-
- .fi
- the commas on the right of the sort are evaluated before the sort, but
- the commas on the left are evaluated after.
- In other words, list operators tend to gobble up all the arguments that
- follow them, and then act like a simple term with regard to the preceding
- expression.
- Note that you have to be careful with parens:
- .nf
-
- .ne 3
- # These evaluate exit before doing the print:
- print($foo, exit); # Obviously not what you want.
- print $foo, exit; # Nor is this.
-
- .ne 4
- # These do the print before evaluating exit:
- (print $foo), exit; # This is what you want.
- print($foo), exit; # Or this.
- print ($foo), exit; # Or even this.
-
- Also note that
-
- print ($foo & 255) + 1, "\en";
-
- .fi
- probably doesn't do what you expect at first glance.
- .Sh "Subroutines"
- A subroutine may be declared as follows:
- .nf
-
- sub NAME BLOCK
-
- .fi
- .PP
- Any arguments passed to the routine come in as array @_,
- that is ($_[0], $_[1], .\|.\|.).
- The array @_ is a local array, but its values are references to the
- actual scalar parameters.
- The return value of the subroutine is the value of the last expression
- evaluated, and can be either an array value or a scalar value.
- Alternatively, a return statement may be used to specify the returned value and
- exit the subroutine.
- To create local variables see the
- .I local
- operator.
- .PP
- A subroutine is called using the
- .I do
- operator or the & operator.
- .nf
-
- .ne 12
- Example:
-
- sub MAX {
-     local($max) = pop(@_);
-     foreach $foo (@_) {
-         $max = $foo \|if \|$max < $foo;
-     }
-     $max;
- }
-
- .\|.\|.
- $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
-
- .ne 21
- Example:
-
- # get a line, combining continuation lines
- # that start with whitespace
- sub get_line {
-     $thisline = $lookahead;
-     line: while ($lookahead = <STDIN>) {
-         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
-             $thisline \|.= \|$lookahead;
-         }
-         else {
-             last line;
-         }
-     }
-     $thisline;
- }
- 
- $lookahead = <STDIN>; # get first line
- while ($_ = do get_line(\|)) {
-     .\|.\|.
- }
-
- .fi
- .nf
- .ne 6
- Use array assignment to a local list to name your formal arguments:
-
- sub maybeset {
-     local($key, $value) = @_;
-     $foo{$key} = $value unless $foo{$key};
- }
-
- .fi
- This also has the effect of turning call-by-reference into call-by-value,
- since the assignment copies the values.
- .Sp
- Subroutines may be called recursively.
- If a subroutine is called using the & form, the argument list is optional.
- If omitted, no @_ array is set up for the subroutine; the @_ array at the
- time of the call is visible to the subroutine instead.
- .nf
-
- do foo(1,2,3); # pass three arguments
- &foo(1,2,3); # the same
-
- do foo(); # pass a null list
- &foo(); # the same
- &foo; # pass no arguments\*(--more efficient
-
- .fi
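- .Sp
- The difference between the last two forms can be observed directly. In this
- sketch the routine names are hypothetical:

```perl
sub count_args {
    $#_ + 1;                      # number of elements in the visible @_
}

sub outer {
    $with_list = &count_args();   # a fresh, empty @_ for the callee
    $without   = &count_args;     # the callee sees outer's own @_
}

&outer('a', 'b', 'c');
print "with (): $with_list, without: $without\n";
```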
- .Sh "Passing By Reference"
- Sometimes you don't want to pass the value of an array to a subroutine but
- rather the name of it, so that the subroutine can modify the global copy
- of it rather than working with a local copy.
- In perl you can refer to all the objects of a particular name by prefixing
- the name with a star: *foo.
- When evaluated, it produces a scalar value that represents all the objects
- of that name, including any filehandle, format or subroutine.
- When assigned to within a local() operation, it causes the name mentioned
- to refer to whatever * value was assigned to it.
- Example:
- .nf
-
- sub doubleary {
-     local(*someary) = @_;
-     foreach $elem (@someary) {
-         $elem *= 2;
-     }
- }
- do doubleary(*foo);
- do doubleary(*bar);
-
- .fi
- Assignment to *name is currently recommended only inside a local().
- You can actually assign to *name anywhere, but the previous referent of
- *name may be stranded forever.
- This may or may not bother you.
- .Sp
- Note that scalars are already passed by reference, so you can modify scalar
- arguments without using this mechanism by referring explicitly to the $_[nnn]
- in question.
- You can modify all the elements of an array by passing all the elements
- as scalars, but you have to use the * mechanism to push, pop or change the
- size of an array.
- The * mechanism will probably be more efficient in any case.
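- .Sp
- As a sketch of modifying arguments through @_ without the * mechanism (the
- routine names here are invented):

```perl
sub trim_all {
    foreach $arg (@_) {           # each $arg is one of the caller's scalars
        $arg =~ s/^\s+//;
        $arg =~ s/\s+$//;
    }
}

sub trimmed {
    local($copy) = @_;            # copying breaks the link to the caller
    $copy =~ s/^\s+//;
    $copy =~ s/\s+$//;
    $copy;
}

$one = "  padded  ";
$two = "  padded  ";
@words = (" x ", " y ");

&trim_all($one, @words);          # $one and every element of @words change
$three = &trimmed($two);          # $two is untouched

print "[$one] [$two] [$three] [@words]\n";
```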
- .Sp
- Since a *name value contains unprintable binary data, if it is used as
- an argument in a print, or as a %s argument in a printf or sprintf, it
- then has the value '*name', just so it prints out pretty.
- .Sp
- Even if you don't want to modify an array, this mechanism is useful for
- passing multiple arrays in a single LIST, since normally the LIST mechanism
- will merge all the array values so that you can't extract out the
- individual arrays.
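- .Sp
- A minimal sketch of both uses, keeping two arrays apart and changing the
- size of one of them; the routine and array names are hypothetical:

```perl
sub longest {
    local(*first, *second) = @_;  # @first and @second name the caller's arrays
    $#first >= $#second ? 'first' : 'second';
}

sub append_to {
    local(*target) = shift(@_);
    push(@target, @_);            # changes the caller's array itself
}

@short = (1, 2);
@long  = (1, 2, 3, 4);

$which = &longest(*short, *long);     # 'second'
&append_to(*short, 7, 8, 9);          # @short is now (1, 2, 7, 8, 9)

print "$which: @short\n";
```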
- .Sh "Regular Expressions"
- The patterns used in pattern matching are regular expressions such as
- those supplied in the Version 8 regexp routines.
- (In fact, the routines are derived from Henry Spencer's freely redistributable
- reimplementation of the V8 routines.)
- In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
- Word boundaries may be matched by \eb, and non-boundaries by \eB.
- A whitespace character is matched by \es, non-whitespace by \eS.
- A numeric character is matched by \ed, non-numeric by \eD.
- You may use \ew, \es and \ed within character classes.
- Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
- Within character classes \eb represents backspace rather than a word boundary.
- Alternatives may be separated by |.
- The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
- matches the digit'th substring.
- (Outside of the pattern, always use $ instead of \e in front of the digit.
- The scope of $<digit> (and $\`, $& and $\')
- extends to the end of the enclosing BLOCK or eval string, or to
- the next pattern match with subexpressions.
- The \e<digit> notation sometimes works outside the current pattern, but should
- not be relied upon.)
- You may have as many parentheses as you wish. If you have more than 9
- substrings, the variables $10, $11, ... refer to the corresponding
- substring. Within the pattern, \e10, \e11,
- etc. refer back to substrings if there have been at least that many left parens
- before the backreference. Otherwise (for backward compatibility) \e10
- is the same as \e010, a backspace,
- and \e11 the same as \e011, a tab.
- And so on.
- (\e1 through \e9 are always backreferences.)
- .PP
- $+ returns whatever the last bracket match matched.
- $& returns the entire matched string.
- ($0 used to return the same thing, but not any more.)
- $\` returns everything before the matched string.
- $\' returns everything after the matched string.
- Examples:
- .nf
-
- s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
-
- .ne 5
- if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
-     $hours = $1;
-     $minutes = $2;
-     $seconds = $3;
- }
-
- .fi
- By default, the ^ character is only guaranteed to match at the beginning
- of the string,
- the $ character only at the end (or before the newline at the end)
- and
- .I perl
- does certain optimizations with the assumption that the string contains
- only one line.
- The behavior of ^ and $ on embedded newlines will be inconsistent.
- You may, however, wish to treat a string as a multi-line buffer, such that
- the ^ will match after any newline within the string, and $ will match
- before any newline.
- At the cost of a little more overhead, you can do this by setting the variable
- $* to 1.
- Setting it back to 0 makes
- .I perl
- revert to its old behavior.
- .PP
- To facilitate multi-line substitutions, the . character never matches a newline
- (even when $* is 0).
- In particular, the following leaves a newline on the $_ string:
- .nf
-
- $_ = <STDIN>;
- s/.*(some_string).*/$1/;
-
- If the newline is unwanted, try one of
-
- s/.*(some_string).*\en/$1/;
- s/.*(some_string)[^\e000]*/$1/;
- s/.*(some_string)(.|\en)*/$1/;
- chop; s/.*(some_string).*/$1/;
- /(some_string)/ && ($_ = $1);
-
- .fi
- Any item of a regular expression may be followed with digits in curly brackets
- of the form {n,m}, where n gives the minimum number of times to match the item
- and m gives the maximum.
- The form {n} is equivalent to {n,n} and matches exactly n times.
- The form {n,} matches n or more times.
- (If a curly bracket occurs in any other context, it is treated as a regular
- character.)
- The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
- to {0,1}.
- There is no limit to the size of n or m, but large numbers will chew up
- more memory.
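- .Sp
- A few counted matches, as an illustrative sketch (the test strings are
- arbitrary, and each result is stored as 1 or 0):

```perl
$zip5  = ('90210'    =~ /^\d{5}$/)       ? 1 : 0;   # exactly five digits
$zip4  = ('9021'     =~ /^\d{5}$/)       ? 1 : 0;   # one digit short
$phone = ('555-1234' =~ /^\d{3}-\d{4}$/) ? 1 : 0;

# {1,} after the group behaves just like +.
$vers  = ('4.0.1.5'  =~ /^\d(\.\d{1,2}){1,}$/) ? 1 : 0;

print "$zip5 $zip4 $phone $vers\n";
```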
- .Sp
- You will note that all backslashed metacharacters in
- .I perl
- are alphanumeric,
- such as \eb, \ew, \en.
- Unlike some other regular expression languages, there are no backslashed
- symbols that aren't alphanumeric.
- So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
- interpreted as a literal character, not a metacharacter.
- This makes it simple to quote a string that you want to use for a pattern
- but that you are afraid might contain metacharacters.
- Simply quote all the non-alphanumeric characters:
- .nf
-
- $pattern =~ s/(\eW)/\e\e$1/g;
-
- .fi
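- .Sp
- The effect of that substitution can be checked as follows; this is only a
- sketch, and the pattern and test strings are made up:

```perl
$pattern = 'a.b*c';
($quoted = $pattern) =~ s/(\W)/\\$1/g;    # backslash before . and *

$literal = ('xa.b*cx' =~ /$quoted/)  ? 1 : 0;   # finds the exact text
$miss    = ('aXbbbc'  =~ /$quoted/)  ? 1 : 0;   # . and * are now ordinary
$loose   = ('aXbbbc'  =~ /$pattern/) ? 1 : 0;   # unquoted, they are special

print "$quoted: $literal $miss $loose\n";
```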
- .Sh "Formats"
- Output record formats for use with the
- .I write
- operator may be declared as follows:
- .nf
-
- .ne 3
- format NAME =
- FORMLIST
- .
-
- .fi
- If NAME is omitted, format \*(L"STDOUT\*(R" is defined.
- FORMLIST consists of a sequence of lines, each of which may be of one of three
- types:
- .Ip 1. 4
- A comment.
- .Ip 2. 4
- A \*(L"picture\*(R" line giving the format for one output line.
- .Ip 3. 4
- An argument line supplying values to plug into a picture line.
- .PP
- Picture lines are printed exactly as they look, except for certain fields
- that substitute values into the line.
- Each picture field starts with either @ or ^.
- The @ field (not to be confused with the array marker @) is the normal
- case; ^ fields are used
- to do rudimentary multi-line text block filling.
- The length of the field is supplied by padding out the field
- with multiple <, >, or | characters to specify, respectively, left justification,
- right justification, or centering.
- As an alternate form of right justification,
- you may also use # characters (with an optional .) to specify a numeric field.
- (Use of ^ instead of @ causes the field to be blanked if undefined.)
- If any of the values supplied for these fields contains a newline, only
- the text up to the newline is printed.
- The special field @* can be used for printing multi-line values.
- It should appear by itself on a line.
- .PP
- The values are specified on the following line, in the same order as
- the picture fields.
- The values should be separated by commas.
- .PP
- Picture fields that begin with ^ rather than @ are treated specially.
- The value supplied must be a scalar variable name which contains a text
- string.
- .I Perl
- puts as much text as it can into the field, and then chops off the front
- of the string so that the next time the variable is referenced,
- more of the text can be printed.
- Normally you would use a sequence of fields in a vertical stack to print
- out a block of text.
- If you like, you can end the final field with .\|.\|., which will appear in the
- output if the text was too long to appear in its entirety.
- You can change which characters are legal to break on by changing the
- variable $: to a list of the desired characters.
- .PP
- Since use of ^ fields can produce variable length records if the text to be
- formatted is short, you can suppress blank lines by putting the tilde (~)
- character anywhere in the line.
- (Normally you should put it in the front if possible, for visibility.)
- The tilde will be translated to a space upon output.
- If you put a second tilde contiguous to the first, the line will be repeated
- until all the fields on the line are exhausted.
- (If you use a field of the @ variety, the expression you supply had better
- not give the same value every time forever!)
- .PP
- Examples:
- .nf
- .lg 0
- .cs R 25
- .ft C
-
- .ne 10
- # a report on the /etc/passwd file
- format STDOUT_TOP =
- \& Passwd File
- Name Login Office Uid Gid Home
- ------------------------------------------------------------------
- \&.
- format STDOUT =
- @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
- $name, $login, $office,$uid,$gid, $home
- \&.
-
- .ne 29
- # a report from a bug report form
- format STDOUT_TOP =
- \& Bug Reports
- @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>>
- $system, $%, $date
- ------------------------------------------------------------------
- \&.
- format STDOUT =
- Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $subject
- Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $index, $description
- Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $priority, $date, $description
- From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $from, $description
- Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $programmer, $description
- \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $description
- \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $description
- \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $description
- \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
- \& $description
- \&~ ^<<<<<<<<<<<<<<<<<<<<<<<...
- \& $description
- \&.
-
- .ft R
- .cs R
- .lg
- .fi
- It is possible to intermix prints with writes on the same output channel,
- but you'll have to handle $\- (lines left on the page) yourself.
- .PP
- If you are printing lots of fields that are usually blank, you should consider
- using the reset operator between records.
- Not only is it more efficient, but it can prevent the bug of adding another
- field and forgetting to zero it.
- .Sh "Interprocess Communication"
- The IPC facilities of perl are built on the Berkeley socket mechanism.
- If you don't have sockets, you can ignore this section.
- The calls have the same names as the corresponding system calls,
- but the arguments tend to differ, for two reasons.
- First, perl file handles work differently than C file descriptors.
- Second, perl already knows the length of its strings, so you don't need
- to pass that information.
- Here is a sample client (untested):
- .nf
-
- ($them,$port) = @ARGV;
- $port = 2345 unless $port;
- $them = 'localhost' unless $them;
-
- $SIG{'INT'} = 'dokill';
- sub dokill { kill 9,$child if $child; }
-
- require 'sys/socket.ph';
-
- $sockaddr = 'S n a4 x8';
- chop($hostname = `hostname`);
-
- ($name, $aliases, $proto) = getprotobyname('tcp');
- ($name, $aliases, $port) = getservbyname($port, 'tcp')
- unless $port =~ /^\ed+$/;
- .ie t \{\
- ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
- 'br\}
- .el \{\
- ($name, $aliases, $type, $len, $thisaddr) =
- gethostbyname($hostname);
- 'br\}
- ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
-
- $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
- $that = pack($sockaddr, &AF_INET, $port, $thataddr);
-
- socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
- bind(S, $this) || die "bind: $!";
- connect(S, $that) || die "connect: $!";
-
- select(S); $| = 1; select(stdout);
-
- if ($child = fork) {
- while (<>) {
- print S;
- }
- sleep 3;
- do dokill();
- }
- else {
- while (<S>) {
- print;
- }
- }
-
- .fi
- And here's a server:
- .nf
-
- ($port) = @ARGV;
- $port = 2345 unless $port;
-
- require 'sys/socket.ph';
-
- $sockaddr = 'S n a4 x8';
-
- ($name, $aliases, $proto) = getprotobyname('tcp');
- ($name, $aliases, $port) = getservbyname($port, 'tcp')
- unless $port =~ /^\ed+$/;
-
- $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
-
- select(NS); $| = 1; select(stdout);
-
- socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
- bind(S, $this) || die "bind: $!";
- listen(S, 5) || die "listen: $!";
-
- select(S); $| = 1; select(stdout);
-
- for (;;) {
- print "Listening again\en";
- ($addr = accept(NS,S)) || die $!;
- print "accept ok\en";
-
- ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
- @inetaddr = unpack('C4',$inetaddr);
- print "$af $port @inetaddr\en";
-
- while (<NS>) {
- print;
- print NS;
- }
- }
-
- .fi
- .Sh "Predefined Names"
- The following names have special meaning to
- .IR perl .
- I could have used alphabetic symbols for some of these, but I didn't want
- to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
- out.
- You'll just have to suffer along with these silly symbols.
- Most of them have reasonable mnemonics, or analogues in one of the shells.
- .Ip $_ 8
- The default input and pattern-searching space.
- The following pairs are equivalent:
- .nf
-
- .ne 2
- while (<>) {\|.\|.\|. # only equivalent in while!
- while ($_ = <>) {\|.\|.\|.
-
- .ne 2
- /\|^Subject:/
- $_ \|=~ \|/\|^Subject:/
-
- .ne 2
- y/a\-z/A\-Z/
- $_ =~ y/a\-z/A\-Z/
-
- .ne 2
- chop
- chop($_)
-
- .fi
- (Mnemonic: underline is understood in certain operations.)
- .Ip $. 8
- The current input line number of the last filehandle that was read.
- Readonly.
- Remember that only an explicit close on the filehandle resets the line number.
- Since <> never does an explicit close, line numbers increase across ARGV files
- (but see examples under eof).
- (Mnemonic: many programs use . to mean the current line number.)
- .Ip $/ 8
- The input record separator, newline by default.
- Works like
- .IR awk 's
- RS variable, including treating blank lines as delimiters
- if set to the null string.
- You may set it to a multi-character string to match a multi-character
- delimiter.
- (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
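- .Sp
- As a minimal sketch of the null-string case (the counter is only
- illustrative), this reads the input one paragraph at a time:
- .nf
-
- .ne 6
- $/ = "";                  # blank lines now delimit records
- $paragraphs = 0;
- while (<>) {
-     $paragraphs++;        # $_ holds a whole paragraph here
- }
- print "$paragraphs paragraphs read\en";
-
- .fi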
- .Ip $, 8
- The output field separator for the print operator.
- Ordinarily the print operator simply prints out the comma separated fields
- you specify.
- In order to get behavior more like
- .IR awk ,
- set this variable as you would set
- .IR awk 's
- OFS variable to specify what is printed between fields.
- (Mnemonic: what is printed when there is a , in your print statement.)
- .Ip $"" 8
- This is like $, except that it applies to array values interpolated into
- a double-quoted string (or similar interpreted string).
- Default is a space.
- (Mnemonic: obvious, I think.)
- .Ip $\e 8
- The output record separator for the print operator.
- Ordinarily the print operator simply prints out the comma separated fields
- you specify, with no trailing newline or record separator assumed.
- In order to get behavior more like
- .IR awk ,
- set this variable as you would set
- .IR awk 's
- ORS variable to specify what is printed at the end of the print.
- (Mnemonic: you set $\e instead of adding \en at the end of the print.
- Also, it's just like /, but it's what you get \*(L"back\*(R" from
- .IR perl .)
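- .Sp
- A brief sketch of $, and $\e working together (the field values are
- arbitrary):
- .nf
-
- .ne 3
- $, = ':';
- $\e = "\en";
- print 'root', 0, '/bin/sh';     # prints root:0:/bin/sh and a newline
-
- .fi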
- .Ip $# 8
- The output format for printed numbers.
- This variable is a half-hearted attempt to emulate
- .IR awk 's
- OFMT variable.
- There are times, however, when
- .I awk
- and
- .I perl
- have differing notions of what
- is in fact numeric.
- Also, the initial value is %.20g rather than %.6g, so you need to set $#
- explicitly to get
- .IR awk 's
- value.
- (Mnemonic: # is the number sign.)
- .Ip $% 8
- The current page number of the currently selected output channel.
- (Mnemonic: % is page number in nroff.)
- .Ip $= 8
- The current page length (printable lines) of the currently selected output
- channel.
- Default is 60.
- (Mnemonic: = has horizontal lines.)
- .Ip $\- 8
- The number of lines left on the page of the currently selected output channel.
- (Mnemonic: lines_on_page \- lines_printed.)
- .Ip $~ 8
- The name of the current report format for the currently selected output
- channel.
- Default is name of the filehandle.
- (Mnemonic: brother to $^.)
- .Ip $^ 8
- The name of the current top-of-page format for the currently selected output
- channel.
- Default is name of the filehandle with \*(L"_TOP\*(R" appended.
- (Mnemonic: points to top of page.)
- .Ip $| 8
- If set to nonzero, forces a flush after every write or print on the currently
- selected output channel.
- Default is 0.
- Note that
- .I STDOUT
- will typically be line buffered if output is to the
- terminal and block buffered otherwise.
- Setting this variable is useful primarily when you are outputting to a pipe,
- such as when you are running a
- .I perl
- script under rsh and want to see the
- output as it's happening.
- (Mnemonic: when you want your pipes to be piping hot.)
- .Ip $$ 8
- The process number of the
- .I perl
- running this script.
- (Mnemonic: same as shells.)
- .Ip $? 8
- The status returned by the last pipe close, backtick (\`\`) command or
- .I system
- operator.
- Note that this is the status word returned by the wait() system
- call, so the exit value of the subprocess is actually ($? >> 8).
- $? & 255 gives which signal, if any, the process died from, and whether
- there was a core dump.
- (Mnemonic: similar to sh and ksh.)
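- .Sp
- A sketch of taking the status word apart (running sh here is just a
- convenient way to get a known exit value; the variable names are
- illustrative):
- .nf
-
- .ne 5
- system('sh', '-c', 'exit 3');
- $exit = $? >> 8;          # exit value of the subprocess, 3 here
- $signal = $? & 127;       # signal number, or 0 if none
- $core = $? & 128;         # nonzero if there was a core dump
- print "exit $exit, signal $signal\en";
-
- .fi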
- .Ip $& 8 4
- The string matched by the last pattern match (not counting any matches hidden
- within a BLOCK or eval enclosed by the current BLOCK).
- (Mnemonic: like & in some editors.)
- .Ip $\` 8 4
- The string preceding whatever was matched by the last pattern match
- (not counting any matches hidden within a BLOCK or eval enclosed by the current
- BLOCK).
- (Mnemonic: \` often precedes a quoted string.)
- .Ip $\' 8 4
- The string following whatever was matched by the last pattern match
- (not counting any matches hidden within a BLOCK or eval enclosed by the current
- BLOCK).
- (Mnemonic: \' often follows a quoted string.)
- Example:
- .nf
-
- .ne 3
- $_ = \'abcdefghi\';
- /def/;
- print "$\`:$&:$\'\en"; # prints abc:def:ghi
-
- .fi
- .Ip $+ 8 4
- The last bracket matched by the last search pattern.
- This is useful if you don't know which of a set of alternative patterns
- matched.
- For example:
- .nf
-
- /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
-
- .fi
- (Mnemonic: be positive and forward looking.)
- .Ip $* 8 2
- Set to 1 to do multiline matching within a string, 0 to tell
- .I perl
- that it can assume that strings contain a single line, for the purpose
- of optimizing pattern matches.
- Pattern matches on strings containing multiple newlines can produce confusing
- results when $* is 0.
- Default is 0.
- (Mnemonic: * matches multiple things.)
- Note that this variable only influences the interpretation of ^ and $.
- A literal newline can be searched for even when $* == 0.
- .Ip $0 8
- Contains the name of the file containing the
- .I perl
- script being executed.
- Assigning to $0 modifies the argument area that the ps(1) program sees.
- (Mnemonic: same as sh and ksh.)
- .Ip $<digit> 8
- Contains the subpattern from the corresponding set of parentheses in the last
- pattern matched, not counting patterns matched in nested blocks that have
- been exited already.
- (Mnemonic: like \edigit.)
- .Ip $[ 8 2
- The index of the first element in an array, and of the first character in
- a substring.
- Default is 0, but you could set it to 1 to make
- .I perl
- behave more like
- .I awk
- (or Fortran)
- when subscripting and when evaluating the index() and substr() functions.
- (Mnemonic: [ begins subscripts.)
- .Ip $] 8 2
- The string printed out when you say \*(L"perl -v\*(R".
- It can be used to determine at the beginning of a script whether the perl
- interpreter executing the script is in the right range of versions.
- If used in a numeric context, returns the version + patchlevel / 1000.
- Example:
- .nf
-
- .ne 8
- # see if getc is available
- ($version,$patchlevel) =
- $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
- print STDERR "(No filename completion available.)\en"
- if $version * 1000 + $patchlevel < 2016;
-
- or, used numerically,
-
- warn "No checksumming!\en" if $] < 3.019;
-
- .fi
- (Mnemonic: Is this version of perl in the right bracket?)
- .Ip $; 8 2
- The subscript separator for multi-dimensional array emulation.
- If you refer to an associative array element as
- .nf
- $foo{$a,$b,$c}
-
- it really means
-
- $foo{join($;, $a, $b, $c)}
-
- But don't put
-
- @foo{$a,$b,$c} # a slice\*(--note the @
-
- which means
-
- ($foo{$a},$foo{$b},$foo{$c})
-
- .fi
- Default is "\e034", the same as SUBSEP in
- .IR awk .
- Note that if your keys contain binary data there might not be any safe
- value for $;.
- (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
- Yeah, I know, it's pretty lame, but $, is already taken for something more
- important.)
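- .Sp
- A small sketch of the emulation in use (the array %cell and its
- contents are made up for illustration); splitting a key on $; recovers
- the original subscripts:
- .nf
-
- .ne 6
- $cell{1,2} = 'x';
- $cell{3,4} = 'y';
- foreach $key (sort keys(%cell)) {
-     ($row, $col) = split(/$;/, $key);
-     print "$row $col $cell{$key}\en";
- }
-
- .fi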
- .Ip $! 8 2
- If used in a numeric context, yields the current value of errno, with all the
- usual caveats.
- (This means that you shouldn't depend on the value of $! to be anything
- in particular unless you've gotten a specific error return indicating a
- system error.)
- If used in a string context, yields the corresponding system error string.
- You can assign to $! in order to set errno
- if, for instance, you want $! to return the string for error n, or you want
- to set the exit value for the die operator.
- (Mnemonic: What just went bang?)
- .Ip $@ 8 2
- The perl syntax error message from the last eval command.
- If null, the last eval parsed and executed correctly (although the operations
- you invoked may have failed in the normal fashion).
- (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
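- .Sp
- A sketch of checking $@ (the first string is deliberately malformed;
- both strings are only illustrations):
- .nf
-
- .ne 4
- eval '1 + ;';                       # does not parse
- print "eval failed: $@" if $@;
- eval '$sum = 1 + 2;';               # parses and runs
- print "sum is $sum\en" unless $@;
-
- .fi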
- .Ip $< 8 2
- The real uid of this process.
- (Mnemonic: it's the uid you came FROM, if you're running setuid.)
- .Ip $> 8 2
- The effective uid of this process.
- Example:
- .nf
-
- .ne 2
- $< = $>; # set real uid to the effective uid
- ($<,$>) = ($>,$<); # swap real and effective uid
-
- .fi
- (Mnemonic: it's the uid you went TO, if you're running setuid.)
- Note: $< and $> can only be swapped on machines supporting setreuid().
- .Ip $( 8 2
- The real gid of this process.
- If you are on a machine that supports membership in multiple groups
- simultaneously, gives a space separated list of groups you are in.
- The first number is the one returned by getgid(), and the subsequent ones
- by getgroups(), one of which may be the same as the first number.
- (Mnemonic: parentheses are used to GROUP things.
- The real gid is the group you LEFT, if you're running setgid.)
- .Ip $) 8 2
- The effective gid of this process.
- If you are on a machine that supports membership in multiple groups
- simultaneously, gives a space separated list of groups you are in.
- The first number is the one returned by getegid(), and the subsequent ones
- by getgroups(), one of which may be the same as the first number.
- (Mnemonic: parentheses are used to GROUP things.
- The effective gid is the group that's RIGHT for you, if you're running setgid.)
- .Sp
- Note: $<, $>, $( and $) can only be set on machines that support the
- corresponding set[re][ug]id() routine.
- $( and $) can only be swapped on machines supporting setregid().
- .Ip $: 8 2
- The current set of characters after which a string may be broken to
- fill continuation fields (starting with ^) in a format.
- Default is "\ \en-", to break on whitespace or hyphens.
- (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
- .Ip $^D 8 2
- The current value of the debugging flags.
- (Mnemonic: value of
- .B \-D
- switch.)
- .Ip $^F 8 2
- The maximum system file descriptor, ordinarily 2. System file descriptors
- are passed to subprocesses, while higher file descriptors are not.
- During an open, system file descriptors are preserved even if the open
- fails. Ordinary file descriptors are closed before the open is attempted.
- .Ip $^I 8 2
- The current value of the inplace-edit extension.
- Use undef to disable inplace editing.
- (Mnemonic: value of
- .B \-i
- switch.)
- .Ip $^P 8 2
- The internal flag that the debugger clears so that it doesn't
- debug itself. You could conceivably disable debugging yourself
- by clearing it.
- .Ip $^T 8 2
- The time at which the script began running, in seconds since the epoch.
- The values returned by the
- .B \-M ,
- .B \-A
- and
- .B \-C
- filetests are based on this value.
- .Ip $^W 8 2
- The current value of the warning switch.
- (Mnemonic: related to the
- .B \-w
- switch.)
- .Ip $^X 8 2
- The name that Perl itself was executed as, from argv[0].
- .Ip $ARGV 8 3
- Contains the name of the current file when reading from <>.
- .Ip @ARGV 8 3
- The array ARGV contains the command line arguments intended for the script.
- Note that $#ARGV is generally the number of arguments minus one, since
- $ARGV[0] is the first argument, NOT the command name.
- See $0 for the command name.
- .Ip @INC 8 3
- The array INC contains the list of places to look for
- .I perl
- scripts to be
- evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
- It initially consists of the arguments to any
- .B \-I
- command line switches, followed
- by the default
- .I perl
- library, probably \*(L"/usr/local/lib/perl\*(R",
- followed by \*(L".\*(R", to represent the current directory.
- .Ip %INC 8 3
- The associative array INC contains entries for each filename that has
- been included via \*(L"do\*(R" or \*(L"require\*(R".
- The key is the filename you specified, and the value is the location of
- the file actually found.
- The \*(L"require\*(R" command uses this array to determine whether
- a given file has already been included.
- .Ip $ENV{expr} 8 2
- The associative array ENV contains your current environment.
- Setting a value in ENV changes the environment for child processes.
- .Ip $SIG{expr} 8 2
- The associative array SIG is used to set signal handlers for various signals.
- Example:
- .nf
-
- .ne 12
- sub handler { # 1st argument is signal name
- local($sig) = @_;
- print "Caught a SIG$sig\-\|\-shutting down\en";
- close(LOG);
- exit(0);
- }
-
- $SIG{\'INT\'} = \'handler\';
- $SIG{\'QUIT\'} = \'handler\';
- .\|.\|.
- $SIG{\'INT\'} = \'DEFAULT\'; # restore default action
- $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT
-
- .fi
- The SIG array only contains values for the signals actually set within
- the perl script.
- .Sh "Packages"
- Perl provides a mechanism for alternate namespaces to protect packages from
- stomping on each other's variables.
- By default, a perl script starts compiling into the package known as \*(L"main\*(R".
- By use of the
- .I package
- declaration, you can switch namespaces.
- The scope of the package declaration is from the declaration itself to the end
- of the enclosing block (the same scope as the local() operator).
- Typically it would be the first declaration in a file to be included by
- the \*(L"require\*(R" operator.
- You can switch into a package in more than one place; it merely influences
- which symbol table is used by the compiler for the rest of that block.
- You can refer to variables and filehandles in other packages by prefixing
- the identifier with the package name and a single quote.
- If the package name is null, the \*(L"main\*(R" package is assumed.
- .PP
- Only identifiers starting with letters are stored in the package's symbol
- table.
- All other symbols are kept in package \*(L"main\*(R".
- In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
- and SIG are forced to be in package \*(L"main\*(R", even when used for
- other purposes than their built-in one.
- Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
- or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
- will be interpreted instead as a pattern match, a substitution
- or a translation.
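- .PP
- A minimal sketch of qualified names (the package \*(L"counter\*(R" and
- its contents are hypothetical):
- .nf
-
- .ne 8
- package counter;
- $count = 0;
- sub main'bump { ++$count; }     # subroutine name lives in main
-
- package main;
- &bump(); &bump();
- print "count is $counter'count\en";   # $count in package counter
-
- .fi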
- .PP
- Eval'ed strings are compiled in the package in which the eval was compiled.
- (Assignments to $SIG{}, however, assume the signal handler specified is in the
- main package.
- Qualify the signal handler name if you wish to have a signal handler in
- a package.)
- For an example, examine perldb.pl in the perl library.
- It initially switches to the DB package so that the debugger doesn't interfere
- with variables in the script you are trying to debug.
- At various points, however, it temporarily switches back to the main package
- to evaluate various expressions in the context of the main package.
- .PP
- The symbol table for a package happens to be stored in the associative array
- of that name prepended with an underscore.
- The value in each entry of the associative array is
- what you are referring to when you use the *name notation.
- In fact, the following have the same effect (in package main, anyway),
- though the first is more
- efficient because it does the symbol table lookups at compile time:
- .nf
-
- .ne 2
- local(*foo) = *bar;
- local($_main{'foo'}) = $_main{'bar'};
-
- .fi
- You can use this to print out all the variables in a package, for instance.
- Here is dumpvar.pl from the perl library:
- .nf
- .ne 11
- package dumpvar;
-
- sub main'dumpvar {
- \& ($package) = @_;
- \& local(*stab) = eval("*_$package");
- \& while (($key,$val) = each(%stab)) {
- \& {
- \& local(*entry) = $val;
- \& if (defined $entry) {
- \& print "\e$$key = '$entry'\en";
- \& }
- .ne 7
- \& if (defined @entry) {
- \& print "\e@$key = (\en";
- \& foreach $num ($[ .. $#entry) {
- \& print " $num\et'",$entry[$num],"'\en";
- \& }
- \& print ")\en";
- \& }
- .ne 10
- \& if ($key ne "_$package" && defined %entry) {
- \& print "\e%$key = (\en";
- \& foreach $key (sort keys(%entry)) {
- \& print " $key\et'",$entry{$key},"'\en";
- \& }
- \& print ")\en";
- \& }
- \& }
- \& }
- }
-
- .fi
- Note that, even though the subroutine is compiled in package dumpvar, the
- name of the subroutine is qualified so that its name is inserted into package
- \*(L"main\*(R".
- .Sh "Style"
- Each programmer will, of course, have his or her own preferences with regard
- to formatting, but there are some general guidelines that will make your
- programs easier to read.
- .Ip 1. 4 4
- Just because you CAN do something a particular way doesn't mean that
- you SHOULD do it that way.
- .I Perl
- is designed to give you several ways to do anything, so consider picking
- the most readable one.
- For instance
-
- open(FOO,$foo) || die "Can't open $foo: $!";
-
- is better than
-
- die "Can't open $foo: $!" unless open(FOO,$foo);
-
- because the second way hides the main point of the statement in a
- modifier.
- On the other hand
-
- print "Starting analysis\en" if $verbose;
-
- is better than
-
- $verbose && print "Starting analysis\en";
-
- since the main point isn't whether the user typed -v or not.
- .Sp
- Similarly, just because an operator lets you assume default arguments
- doesn't mean that you have to make use of the defaults.
- The defaults are there for lazy systems programmers writing one-shot
- programs.
- If you want your program to be readable, consider supplying the argument.
- .Sp
- Along the same lines, just because you
- .I can
- omit parentheses in many places doesn't mean that you ought to:
- .nf
-
- return print reverse sort num values array;
- return print(reverse(sort num (values(%array))));
-
- .fi
- When in doubt, parenthesize.
- At the very least it will let some poor schmuck bounce on the % key in vi.
- .Sp
- Even if you aren't in doubt, consider the mental welfare of the person who
- has to maintain the code after you, and who will probably put parens in
- the wrong place.
- .Ip 2. 4 4
- Don't go through silly contortions to exit a loop at the top or the
- bottom, when
- .I perl
- provides the "last" operator so you can exit in the middle.
- Just outdent it a little to make it more visible:
- .nf
-
- .ne 7
- line:
- for (;;) {
- statements;
- last line if $foo;
- next line if /^#/;
- statements;
- }
-
- .fi
- .Ip 3. 4 4
- Don't be afraid to use loop labels\*(--they're there to enhance readability as
- well as to allow multi-level loop breaks.
- See the previous example.
- .Ip 4. 4 4
- For portability, when using features that may not be implemented on every
- machine, test the construct in an eval to see if it fails.
- If you know in which version or patchlevel a particular feature was implemented,
- you can test $] to see if it will be there.
- .Ip 5. 4 4
- Choose mnemonic identifiers.
- .Ip 6. 4 4
- Be consistent.
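- .PP
- Guideline 4 might be sketched as follows, using symlink as an example
- of a function that is not implemented on every machine (the variable
- name is illustrative):
- .nf
-
- .ne 3
- $symlink_exists = (eval 'symlink("","");', $@ eq '');
- print "No symbolic links on this machine\en" unless $symlink_exists;
-
- .fi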
- .Sh "Debugging"
- If you invoke
- .I perl
- with a
- .B \-d
- switch, your script will be run under a debugging monitor.
- It will halt before the first executable statement and ask you for a
- command, such as:
- .Ip "h" 12 4
- Prints out a help message.
- .Ip "T" 12 4
- Stack trace.
- .Ip "s" 12 4
- Single step.
- Executes until it reaches the beginning of another statement.
- .Ip "n" 12 4
- Next.
- Executes over subroutine calls, until it reaches the beginning of the
- next statement.
- .Ip "f" 12 4
- Finish.
- Executes statements until it has finished the current subroutine.
- .Ip "c" 12 4
- Continue.
- Executes until the next breakpoint is reached.
- .Ip "c line" 12 4
- Continue to the specified line.
- Inserts a one-time-only breakpoint at the specified line.
- .Ip "<CR>" 12 4
- Repeat last n or s.
- .Ip "l min+incr" 12 4
- List incr+1 lines starting at min.
- If min is omitted, starts where last listing left off.
- If incr is omitted, previous value of incr is used.
- .Ip "l min-max" 12 4
- List lines in the indicated range.
- .Ip "l line" 12 4
- List just the indicated line.
- .Ip "l" 12 4
- List next window.
- .Ip "-" 12 4
- List previous window.
- .Ip "w line" 12 4
- List window around line.
- .Ip "l subname" 12 4
- List subroutine.
- If it's a long subroutine it just lists the beginning.
- Use \*(L"l\*(R" to list more.
- .Ip "/pattern/" 12 4
- Regular expression search forward for pattern; the final / is optional.
- .Ip "?pattern?" 12 4
- Regular expression search backward for pattern; the final ? is optional.
- .Ip "L" 12 4
- List lines that have breakpoints or actions.
- .Ip "S" 12 4
- Lists the names of all subroutines.
- .Ip "t" 12 4
- Toggle trace mode on or off.
- .Ip "b line condition" 12 4
- Set a breakpoint.
- If line is omitted, sets a breakpoint on the
- line that is about to be executed.
- If a condition is specified, it is evaluated each time the statement is
- reached and a breakpoint is taken only if the condition is true.
- Breakpoints may only be set on lines that begin an executable statement.
- .Ip "b subname condition" 12 4
- Set breakpoint at first executable line of subroutine.
- .Ip "d line" 12 4
- Delete breakpoint.
- If line is omitted, deletes the breakpoint on the
- line that is about to be executed.
- .Ip "D" 12 4
- Delete all breakpoints.
- .Ip "a line command" 12 4
- Set an action for line.
- A multi-line command may be entered by backslashing the newlines.
- .Ip "A" 12 4
- Delete all line actions.
- .Ip "< command" 12 4
- Set an action to happen before every debugger prompt.
- A multi-line command may be entered by backslashing the newlines.
- .Ip "> command" 12 4
- Set an action to happen after the prompt when you've just given a command
- to return to executing the script.
- A multi-line command may be entered by backslashing the newlines.
- .Ip "V package" 12 4
- List all variables in package.
- Default is main package.
- .Ip "! number" 12 4
- Redo a debugging command.
- If number is omitted, redoes the previous command.
- .Ip "! -number" 12 4
- Redo the command that was that many commands ago.
- .Ip "H -number" 12 4
- Display last n commands.
- Only commands longer than one character are listed.
- If number is omitted, lists them all.
- .Ip "q or ^D" 12 4
- Quit.
- .Ip "command" 12 4
- Execute command as a perl statement.
- A missing semicolon will be supplied.
- .Ip "p expr" 12 4
- Same as \*(L"print DB'OUT expr\*(R".
- The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
- may be redirected to.
- .PP
- If you want to modify the debugger, copy perldb.pl from the perl library
- to your current directory and modify it as necessary.
- (You'll also have to put -I. on your command line.)
- You can do some customization by setting up a .perldb file which contains
- initialization code.
- For instance, you could make aliases like these:
- .nf
-
- $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
- $DB'alias{'stop'} = 's/^stop (at|in)/b/';
- $DB'alias{'.'} =
- 's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
-
- .fi
- .Sh "Setuid Scripts"
- .I Perl
- is designed to make it easy to write secure setuid and setgid scripts.
- Unlike shells, which are based on multiple substitution passes on each line
- of the script,
- .I perl
- uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
- Additionally, since the language has more built-in functionality, it
- has to rely less upon external (and possibly untrustworthy) programs to
- accomplish its purposes.
- .PP
- In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
- insecure, but this kernel feature can be disabled.
- If it is,
- .I perl
- can emulate the setuid and setgid mechanism when it notices the otherwise
- useless setuid/gid bits on perl scripts.
- If the kernel feature isn't disabled,
- .I perl
- will complain loudly that your setuid script is insecure.
- You'll need to either disable the kernel setuid script feature, or put
- a C wrapper around the script.
- .PP
- When perl is executing a setuid script, it takes special precautions to
- prevent you from falling into any obvious traps.
- (In some ways, a perl script is more secure than the corresponding
- C program.)
- Any command line argument, environment variable, or input is marked as
- \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
- command that invokes a subshell, or in any command that modifies files,
- directories or processes.
- Any variable that is set within an expression that has previously referenced
- a tainted value also becomes tainted (even if it is logically impossible
- for the tainted value to influence the variable).
- For example:
- .nf
-
- .ne 5
- $foo = shift; # $foo is tainted
- $bar = $foo,\'bar\'; # $bar is also tainted
- $xxx = <>; # Tainted
- $path = $ENV{\'PATH\'}; # Tainted, but see below
- $abc = \'abc\'; # Not tainted
-
- .ne 4
- system "echo $foo"; # Insecure
- system "/bin/echo", $foo; # Secure (doesn't use sh)
- system "echo $bar"; # Insecure
- system "echo $abc"; # Insecure until PATH set
-
- .ne 5
- $ENV{\'PATH\'} = \'/bin:/usr/bin\';
- $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
-
- $path = $ENV{\'PATH\'}; # Not tainted
- system "echo $abc"; # Is secure now!
-
- .ne 5
- open(FOO,"$foo"); # OK
- open(FOO,">$foo"); # Not OK
-
- open(FOO,"echo $foo|"); # Not OK, but...
- open(FOO,"-|") || exec \'echo\', $foo; # OK
-
- $zzz = `echo $foo`; # Insecure, zzz tainted
-
- unlink $abc,$foo; # Insecure
- umask $foo; # Insecure
-
- .ne 3
- exec "echo $foo"; # Insecure
- exec "echo", $foo; # Secure (doesn't use sh)
- exec "sh", \'-c\', $foo; # Considered secure, alas
-
- .fi
- The taintedness is associated with each scalar value, so some elements
- of an array can be tainted, and others not.
- .PP
- If you try to do something insecure, you will get a fatal error saying
- something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
- Note that you can still write an insecure system call or exec,
- but only by explicitly doing something like the last example above.
- You can also bypass the tainting mechanism by referencing
- subpatterns\*(--\c
- .I perl
- presumes that if you reference a substring using $1, $2, etc., you knew
- what you were doing when you wrote the pattern:
- .nf
-
- $ARGV[0] =~ /^\-P(\ew+)$/;
- $printer = $1; # Not tainted
-
- .fi
- This is fairly secure since \ew+ doesn't match shell metacharacters.
- Use of .+ would have been insecure, but
- .I perl
- doesn't check for that, so you must be careful with your patterns.
- This is the ONLY mechanism for untainting user supplied filenames if you
- want to do file operations on them (unless you make $> equal to $<).
- .PP
- It's also possible to get into trouble with other operations that don't care
- whether they use tainted values.
- Make judicious use of the file tests in dealing with any user-supplied
- filenames.
- When possible, do opens and such after setting $> = $<.
- .I Perl
- doesn't prevent you from opening tainted filenames for reading, so be
- careful what you print out.
- The tainting mechanism is intended to prevent stupid mistakes, not to remove
- the need for thought.
- .SH ENVIRONMENT
- .I Perl
- uses PATH in executing subprocesses, and in finding the script if \-S
- is used.
- HOME or LOGDIR are used if chdir has no argument.
- .PP
- Apart from these,
- .I perl
- uses no environment variables, except to make them available
- to the script being executed, and to child processes.
- However, scripts running setuid would do well to execute the following lines
- before doing anything else, just to keep people honest:
- .nf
-
- .ne 3
- $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need
- $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
- $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
-
- .fi
- .SH AUTHOR
- Larry Wall <lwall@netlabs.com>
- .br
- MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
- .SH FILES
- /tmp/perl\-eXXXXXX temporary file for
- .B \-e
- commands.
- .SH SEE ALSO
- a2p awk to perl translator
- .br
- s2p sed to perl translator
- .SH DIAGNOSTICS
- Compilation errors will tell you the line number of the error, with an
- indication of the next token or token type that was to be examined.
- (In the case of a script passed to
- .I perl
- via
- .B \-e
- switches, each
- .B \-e
- is counted as one line.)
- .PP
- Setuid scripts have additional constraints that can produce error messages
- such as \*(L"Insecure dependency\*(R".
- See the section on setuid scripts.
- .SH TRAPS
- Accustomed
- .IR awk
- users should take special note of the following:
- .Ip * 4 2
- Semicolons are required after all simple statements in
- .IR perl .
- Newline
- is not a statement delimiter.
- .Ip * 4 2
- Curly brackets are required on ifs and whiles.
- .Ip * 4 2
- Variables begin with $ or @ in
- .IR perl .
- .Ip * 4 2
- Arrays index from 0 unless you set $[.
- Likewise string positions in substr() and index().
- .Ip * 4 2
- You have to decide whether your array has numeric or string indices.
- .Ip * 4 2
- Associative array values do not spring into existence upon mere reference.
- .Ip * 4 2
- You have to decide whether you want to use string or numeric comparisons.
- .Ip * 4 2
- Reading an input line does not split it for you. You get to split it yourself
- to an array.
- And the
- .I split
- operator has different arguments.
- .Ip * 4 2
- The current input line is normally in $_, not $0.
- It generally does not have the newline stripped.
- ($0 is the name of the program executed.)
- .Ip * 4 2
- $<digit> does not refer to fields\*(--it refers to substrings matched by the last
- match pattern.
- .Ip * 4 2
- The
- .I print
- statement does not add field and record separators unless you set
- $, and $\e.
- .Ip * 4 2
- You must open your files before you print to them.
- .Ip * 4 2
- The range operator is \*(L".\|.\*(R", not comma.
- (The comma operator works as in C.)
- .Ip * 4 2
- The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
- (\*(L"~\*(R" is the one's complement operator, as in C.)
- .Ip * 4 2
- The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
- (\*(L"^\*(R" is the XOR operator, as in C.)
- .Ip * 4 2
- The concatenation operator is \*(L".\*(R", not the null string.
- (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
- since the third slash would be interpreted as a division operator\*(--the
- tokener is in fact slightly context sensitive for operators like /, ?, and <.
- And in fact, . itself can be the beginning of a number.)
- .Ip * 4 2
- .IR Next ,
- .I exit
- and
- .I continue
- work differently.
- .Ip * 4 2
- The following variables work differently:
- .nf
-
- Awk \h'|2.5i'Perl
- ARGC \h'|2.5i'$#ARGV
- ARGV[0] \h'|2.5i'$0
- FILENAME\h'|2.5i'$ARGV
- FNR \h'|2.5i'$. \- something
- FS \h'|2.5i'(whatever you like)
- NF \h'|2.5i'$#Fld, or some such
- NR \h'|2.5i'$.
- OFMT \h'|2.5i'$#
- OFS \h'|2.5i'$,
- ORS \h'|2.5i'$\e
- RLENGTH \h'|2.5i'length($&)
- RS \h'|2.5i'$/
- RSTART \h'|2.5i'length($\`)
- SUBSEP \h'|2.5i'$;
-
- .fi
- .Ip * 4 2
- When in doubt, run the
- .I awk
- construct through a2p and see what it gives you.
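- .PP
- As a rough illustration of several of the traps above, here is one way the
- .I awk
- idiom of printing the first field of every line whose third field exceeds 100
- might be written by hand.
- The colon separator and the array name @Fld are assumptions chosen for the
- example.
- .nf
-
- .ne 6
-     while (<>) {
-         chop;                        # the newline is not stripped for you
-         @Fld = split(/:/, $_);       # nor is the line split into fields
-         print $Fld[0], "\en"         # arrays index from 0; print adds no newline
-             if $Fld[2] > 100;        # > compares numbers, gt compares strings
-     }
-
- .fi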
- .PP
- Cerebral C programmers should take note of the following:
- .Ip * 4 2
- Curly brackets are required on ifs and whiles.
- .Ip * 4 2
- You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
- .Ip * 4 2
- .I Break
- and
- .I continue
- become
- .I last
- and
- .IR next ,
- respectively.
- .Ip * 4 2
- There's no switch statement.
- .Ip * 4 2
- Variables begin with $ or @ in
- .IR perl .
- .Ip * 4 2
- Printf does not implement *.
- .Ip * 4 2
- Comments begin with #, not /*.
- .Ip * 4 2
- You can't take the address of anything.
- .Ip * 4 2
- ARGV must be capitalized.
- .Ip * 4 2
- The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
- .Ip * 4 2
- Signal handlers deal with signal names, not numbers.
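- .PP
- A short sketch of how a few of these differences look in practice.
- The file name is hypothetical and the loop is only illustrative.
- .nf
-
- .ne 12
-     foreach $n (1 .. 10) {
-         next if $n % 2;              # where C would say continue
-         last if $n > 6;              # where C would say break
-         if ($n == 2) {               # curly brackets are mandatory
-             print "two\en";
-         } elsif ($n == 4) {          # elsif, not else if
-             print "four\en";
-         } else {
-             print "$n\en";
-         }
-     }
-     unlink("/tmp/scratch") || print "unlink failed: $!\en";   # true on success
-
- .fi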
- .PP
- Seasoned
- .I sed
- programmers should take note of the following:
- .Ip * 4 2
- Backreferences in substitutions use $ rather than \e.
- .Ip * 4 2
- The pattern matching metacharacters (, ), and | do not have backslashes in front.
- .Ip * 4 2
- The range operator is .\|. rather than comma.
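- .PP
- For instance, a substitution that a
- .I sed
- script would write with backslashed parentheses and \e1 could look like this in
- .IR perl .
- The sample string is made up for the example.
- .nf
-
- .ne 3
-     $_ = "cats and dogs";
-     s/(cat|dog)s/$1/g;       # bare ( ) and | are metacharacters; $1, not \e1
-     print "$_\en";           # prints "cat and dog"
-
- .fi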
- .PP
- Sharp shell programmers should take note of the following:
- .Ip * 4 2
- The backtick operator does variable interpretation without regard to the
- presence of single quotes in the command.
- .Ip * 4 2
- The backtick operator does no translation of the return value, unlike csh.
- .Ip * 4 2
- Shells (especially csh) do several levels of substitution on each command line.
- .I Perl
- does substitution only in certain constructs such as double quotes,
- backticks, angle brackets and search patterns.
- .Ip * 4 2
- Shells interpret scripts a little bit at a time.
- .I Perl
- compiles the whole program before executing it.
- .Ip * 4 2
- The arguments are available via @ARGV, not $1, $2, etc.
- .Ip * 4 2
- The environment is not automatically made available as variables.
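- .PP
- A small sketch of the last two points.
- The variable names $home, $first and $count are chosen for the example only.
- .nf
-
- .ne 4
-     $home = $ENV{"HOME"};          # the shell would say $HOME
-     $first = $ARGV[0];             # the shell would say $1
-     $count = $#ARGV + 1;           # number of arguments
-     print "$count args, first is $first, home is $home\en";
-
- .fi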
- .SH ERRATA\0AND\0ADDENDA
- The Perl book,
- .IR Programming\0Perl ,
- has the following omissions and goofs.
- .PP
- On page 5, the examples which read
- .nf
-
- eval "/usr/bin/perl
-
- should read
-
- eval "exec /usr/bin/perl
-
- .fi
- .PP
- On page 195, the equivalent to the System V sum program only works for
- very small files. To do larger files, use
- .nf
-
- undef $/;
- $checksum = unpack("%32C*",<>) % 32767;
-
- .fi
- .PP
- The descriptions of alarm and sleep refer to signal SIGALARM. These
- should refer to SIGALRM.
- .PP
- The
- .B \-0
- switch to set the initial value of $/ was added to Perl after the book
- went to press.
- .PP
- The
- .B \-l
- switch now does automatic line ending processing.
- .PP
- The qx// construct is now a synonym for backticks.
- .PP
- $0 may now be assigned to set the argument displayed by
- .IR ps (1).
- .PP
- The new @###.## format was omitted accidentally from the description
- on formats.
- .PP
- It wasn't known at press time that s///ee caused multiple evaluations of
- the replacement expression. This is to be construed as a feature.
- .PP
- (LIST) x $count now does array replication.
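- .PP
- A brief sketch of array replication (the variable names are arbitrary):
- .nf
-
- .ne 3
-     @zeros = (0) x 8;                # a list of eight zeros
-     @ab = ("a", "b") x 3;            # a b a b a b
-     print join(",", @ab), "\en";     # prints "a,b,a,b,a,b"
-
- .fi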
- .PP
- There is now no limit on the number of parentheses in a regular expression.
- .PP
- In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
- \el, \eL, \eu, \eU, \eE. The last five control upper/lower case translation.
- .PP
- The
- .B $/
- variable may now be set to a multi-character delimiter.
- .PP
- There is now a g modifier on ordinary pattern matching that causes it
- to iterate through a string finding multiple matches.
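- .PP
- As an illustration, assuming a made-up string of name=value pairs, a loop
- such as the following visits each match in turn:
- .nf
-
- .ne 4
-     $_ = "a=1 b=22 c=333";
-     while (/(\ew)=(\ed+)/g) {        # each test resumes after the previous match
-         print "$1 is $2\en";
-     }
-
- .fi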
- .PP
- All of the $^X variables are new except for $^T.
- .PP
- The default top-of-form format for FILEHANDLE is now FILEHANDLE_TOP rather
- than top.
- .PP
- The eval {} and sort {} constructs were added in version 4.018.
- .PP
- The v and V (little-endian) template options for pack and unpack were
- added in 4.019.
- .SH BUGS
- .PP
- .I Perl
- is at the mercy of your machine's definitions of various operations
- such as type casting, atof() and sprintf().
- .PP
- If your stdio requires a seek or eof between reads and writes on a particular
- stream, so does
- .IR perl .
- (This doesn't apply to sysread() and syswrite().)
- .PP
- While none of the built-in data types have any arbitrary size limits (apart
- from memory size), there are still a few arbitrary limits:
- a given identifier may not be longer than 255 characters,
- and no component of your PATH may be longer than 255 characters if you use \-S.
- .PP
- .I Perl
- actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
- anyone I said that.
- .rn }` ''
-